DOI: 10.1145/3639476.3639763
research-article
Open access

Beyond a Joke: Dead Code Elimination Can Delete Live Code

Published: 24 May 2024

Abstract

Dead Code Elimination (DCE) is a fundamental compiler optimization that removes dead code (e.g., code that is unreachable, or reachable but whose results are never used) from a program to produce smaller or faster executables. However, because compilers apply optimizations aggressively and a large number of optimizations (including DCE) interact in complex ways, it is not known whether DCE is performed correctly in practice and deletes only dead code. In this study, we open a new research problem: can DCE erroneously delete live code? To tackle this problem, we design a new approach named Xdead, which leverages differential testing, static binary analysis, and dynamic symbolic execution to detect miscompilation bugs caused by erroneously deleted live code. A preliminary evaluation shows that Xdead can identify many divergent portions that indicate erroneously deleted live code, and that it ultimately detects two such miscompilation bugs in LLVM compilers. Our findings call for more attention to potential issues in existing DCE implementations and for more conservative strategies when designing new DCE-related compiler optimizations.
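
To make the failure mode concrete, the following is a minimal Python sketch of a DCE pass over a toy straight-line intermediate representation, together with a differential check between the unoptimized and optimized programs. It is an illustration only and not Xdead or LLVM's implementation: the instruction format, the injected bug (skipping the operands of `copy` during liveness tracking), and the names `run`, `dce`, and `diverges` are all assumptions made for this example. Xdead itself works on compiled binaries and additionally uses static binary analysis and symbolic execution, which this sketch does not model.

```python
# Toy illustration only: NOT Xdead and NOT LLVM's DCE pass.
# A program is a list of (dest, op, args) tuples; args are variable names
# (str) or integer constants. The value of `ret_var` is the observable output.

def run(program, inputs, ret_var):
    """Interpret the toy program. A variable that was never defined reads
    as 0, a stand-in for a stale register in a miscompiled binary."""
    env = dict(inputs)
    for dest, op, args in program:
        vals = [env.get(a, 0) if isinstance(a, str) else a for a in args]
        if op == "add":
            env[dest] = vals[0] + vals[1]
        elif op == "mul":
            env[dest] = vals[0] * vals[1]
        elif op == "copy":
            env[dest] = vals[0]
        else:
            raise ValueError(f"unknown op: {op}")
    return env.get(ret_var, 0)


def dce(program, ret_var, buggy=False):
    """Backward liveness sweep: keep an instruction only if its destination
    is live. With buggy=True the operands of `copy` are (wrongly) not marked
    live, so the instruction that feeds the copy is deleted as 'dead'."""
    live = {ret_var}
    kept = []
    for dest, op, args in reversed(program):
        if dest not in live:
            continue  # result is never used later: dead, drop it
        live.discard(dest)
        if not (buggy and op == "copy"):
            live.update(a for a in args if isinstance(a, str))
        kept.append((dest, op, args))
    kept.reverse()
    return kept


def diverges(program, ret_var, inputs, buggy):
    """Differential test: does optimization change the observable output?"""
    optimized = dce(program, ret_var, buggy)
    return run(program, inputs, ret_var) != run(optimized, inputs, ret_var)


PROGRAM = [
    ("t", "add", ("a", "b")),   # live: reaches r through c
    ("u", "mul", ("a", 2)),     # dead: u is never used
    ("c", "copy", ("t",)),
    ("r", "add", ("c", 1)),
]
INPUTS = {"a": 2, "b": 3}

if __name__ == "__main__":
    print(diverges(PROGRAM, "r", INPUTS, buggy=False))
    print(diverges(PROGRAM, "r", INPUTS, buggy=True))
```

With the correct pass, only the unused `u` is removed and the two outputs agree. With the injected bug, the definition of `t` is also removed even though it is live, and the outputs differ, which is the kind of divergence a differential test can flag.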

Cited By

  • (2024) Precise Lake: A Feedback and Visualization System For Optimizing Code Health. 2024 15th International Conference on Computing Communication and Networking Technologies (ICCCNT), pp. 1-7. DOI: 10.1109/ICCCNT61001.2024.10724126. Online publication date: 24-Jun-2024

Published In

ICSE-NIER'24: Proceedings of the 2024 ACM/IEEE 44th International Conference on Software Engineering: New Ideas and Emerging Results
April 2024
127 pages
ISBN: 9798400705007
DOI: 10.1145/3639476
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

In-Cooperation

  • Faculty of Engineering of University of Porto

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. reliability
  2. software testing
  3. program analysis
  4. symbolic execution

Qualifiers

  • Research-article

Conference

ICSE-NIER'24

Article Metrics

  • Downloads (Last 12 months): 230
  • Downloads (Last 6 weeks): 63
Reflects downloads up to 14 Dec 2024
