research-article

Open access

Beyond a Joke: Dead Code Elimination Can Delete Live Code

Authors:

Lingxiao Jiang,

He Jiang

ICSE-NIER'24: Proceedings of the 2024 ACM/IEEE 44th International Conference on Software Engineering: New Ideas and Emerging Results

Pages 32 - 36

https://doi.org/10.1145/3639476.3639763

Published: 24 May 2024

Abstract

Dead Code Elimination (DCE) is a fundamental compiler optimization that removes dead code (e.g., code that is unreachable, or reachable but whose results are never used) from a program to produce smaller or faster executables. However, because compiler optimizations are typically applied aggressively and a large number of optimizations (including DCE) interact in complex ways, it is not known whether DCE is always performed correctly and deletes only dead code in practice. In this study, we open a new research problem: can DCE erroneously delete live code? To tackle this problem, we design a new approach named Xdead, which combines differential testing, static binary analysis, and dynamic symbolic execution to detect miscompilation bugs caused by erroneously deleted live code. A preliminary evaluation shows that Xdead can identify many divergent portions that indicate erroneously deleted live code, and that it ultimately detects two such miscompilation bugs in LLVM compilers. Our findings call for more attention to potential issues in existing DCE implementations and for more conservative strategies when designing new DCE-related compiler optimizations.
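To make the problem concrete, the following is a minimal sketch of how differential testing can expose a DCE pass that deletes live code. It is not Xdead's implementation: Xdead works on compiled binaries and also uses static binary analysis and dynamic symbolic execution, none of which are modelled here. The toy three-address IR, the `buggy` flag, and every name in the block (`PROGRAM`, `run`, `dce`, `differential_test`) are assumptions made purely for illustration.

```python
from typing import Callable, Dict, List, Optional, Set, Tuple, Union

Operand = Union[str, int]  # a variable name or an integer constant
Instr = Tuple[Optional[str], str, Tuple[Operand, ...]]  # (dest, op, operands)

# Hypothetical straight-line program in a toy IR (not a real compiler IR).
PROGRAM: List[Instr] = [
    ("a", "add", ("x", 1)),    # live: feeds c
    ("b", "mul", ("x", 2)),    # live: feeds c
    ("t", "mul", ("a", "a")),  # dead: t is never used
    ("c", "add", ("a", "b")),  # live: its value is printed
    (None, "print", ("c",)),   # side effect, always live
]


def run(program: List[Instr], inputs: Dict[str, int]) -> List[int]:
    """Interpret the program and return the values it prints."""
    env = dict(inputs)
    printed: List[int] = []
    for dest, op, args in program:
        # Assumption: an undefined variable reads as 0, like a register
        # that was never written because its definition was deleted.
        vals = [env.get(a, 0) if isinstance(a, str) else a for a in args]
        if op == "add":
            env[dest] = vals[0] + vals[1]
        elif op == "mul":
            env[dest] = vals[0] * vals[1]
        elif op == "print":
            printed.append(vals[0])
        else:
            raise ValueError(f"unknown op: {op}")
    return printed


def dce(program: List[Instr], buggy: bool = False) -> List[Instr]:
    """Backward liveness sweep over straight-line code.

    An instruction is kept if it has a side effect or defines a live
    variable. With buggy=True, only the first operand of a kept
    instruction is marked live, which is an artificial bug that makes
    the pass delete live code.
    """
    live: Set[str] = set()
    kept: List[Instr] = []
    for dest, op, args in reversed(program):
        if op == "print" or dest in live:
            kept.append((dest, op, args))
            live.discard(dest)
            uses = args[:1] if buggy else args
            live.update(a for a in uses if isinstance(a, str))
    kept.reverse()
    return kept


def differential_test(
    program: List[Instr],
    optimizer: Callable[[List[Instr]], List[Instr]],
    test_inputs: List[Dict[str, int]],
) -> List[Dict[str, int]]:
    """Return the inputs on which original and optimized outputs diverge."""
    optimized = optimizer(program)
    return [i for i in test_inputs if run(program, i) != run(optimized, i)]


INPUTS = [{"x": 0}, {"x": 3}]
correct_divergences = differential_test(PROGRAM, dce, INPUTS)
buggy_divergences = differential_test(PROGRAM, lambda p: dce(p, buggy=True), INPUTS)
```

In this sketch the correct pass removes only the definition of `t` and produces no divergence, while the buggy pass also removes the definition of `b` and diverges on `{"x": 3}`. The input `{"x": 0}` does not expose the bug, because `b` would have been 0 anyway, which hints at why a fixed set of test inputs can miss such deletions.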

Published In

ICSE-NIER'24: Proceedings of the 2024 ACM/IEEE 44th International Conference on Software Engineering: New Ideas and Emerging Results

April 2024

127 pages

ISBN:9798400705007

DOI:10.1145/3639476

Co-chairs: Ana Paiva, Rui Abreu, Robert Hierons (University of Sheffield, United Kingdom), Henrique Madeira (University of Coimbra, Portugal)

Program Co-chairs: Abhik Roychoudhury, Margaret Storey

Copyright © 2024 Copyright is held by the owner/author(s). Publication rights licensed to ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

In-Cooperation

Faculty of Engineering of University of Porto

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

ICSE-NIER'24

Sponsor:

SIGSOFT

ICSE-NIER'24: 2024 ACM/IEEE 44th International Conference on Software Engineering: New Ideas and Emerging Results

April 14 - 20, 2024

Lisbon, Portugal
