Nothing Special   »   [go: up one dir, main page]

skip to main content
research-article

Automated Patch Transplantation

Published: 31 December 2020 Publication History

Abstract

Automated program repair is an emerging area that attempts to patch software errors and vulnerabilities. In this article, we formulate and study a problem related to automated repair, namely automated patch transplantation. A patch for an error in a donor program is automatically adapted and inserted into a “similar” target program. We observe that despite standard procedures for vulnerability disclosures and publishing of patches, many un-patched occurrences remain in the wild. One of the main reasons is the fact that various implementations of the same functionality may exist and, hence, published patches need to be modified and adapted. In this article, we therefore propose and implement a workflow for transplanting patches. Our approach centers on identifying patch insertion points, as well as namespaces translation across programs via symbolic execution. Experimental results to eliminate five classes of errors highlight our ability to fix recurring vulnerabilities across various programs through transplantation. We report that in 20 of 24 fixing tasks involving eight application subjects mostly involving file processing programs, we successfully transplanted the patch and validated the transplantation through differential testing. Since the publication of patches make an un-patched implementation more vulnerable, our proposed techniques should serve a long-standing need in practice.

References

[1]
Agostino Sarubbo. 2019. Agostino’s blog. Retrieved February 24, 2019, from https://blogs.gentoo.org/ago
[2]
LLVM Developer Group. 2019. The Clang Project. Retrieved March 11, 2019, from https://clang.llvm.org.
[3]
Microsoft Research. 2019. The Z3 Theorem Prover. Retrieved March 11, 2019, from https://github.com/Z3Prover/z3.
[4]
Docker Inc. 2020. The Docker Project. Retrieved March 5, 2019, from https://www.docker.com/.
[5]
Michael D. Adams and Faouzi Kossentini. 2000. JasPer: A software-based JPEG-2000 codec implementation. In Proceedings 2000 International Conference on Image Processing (Cat. No. 00CH37101), Vol. 2. IEEE, 53--56.
[6]
Taweesup Apiwattanapong, Alessandro Orso, and Mary Jean Harrold. 2007. JDiff: A differencing technique and tool for object-oriented programs. In 29th IEEE/ACM International Conference on Automated Software Engineering (ASE'14). 3--36.
[7]
Tiffany Bao, Ruoyu Wang, Yan Shoshitaishvili, and David Brumley. 2017. Your exploit is mine: Automatic shellcode transplant for remote exploits. In 2017 IEEE Symposium on Security and Privacy (SP’07). IEEE, 824--839.
[8]
Earl T. Barr, Mark Harman, Yue Jia, Alexandru Marginean, and Justyna Petke. 2015. Automated software transplantation. In Proceedings of the 2015 International Symposium on Software Testing and Analysis (ISSTA'18). ACM, 257--269.
[9]
Cristian Cadar, Daniel Dunbar, and Dawson Engler. 2008. KLEE: Unassisted and automatic generation of high-coverage tests for complex systems programs. In Proceedings of the 8th USENIX Conference on Operating Systems Design and Implementation (OSDI’08). USENIX Association, Berkeley, CA, 209--224. http://dl.acm.org/citation.cfm?id=1855741.1855756.
[10]
Zakir Durumeric, Frank Li, James Kasten, Johanna Amann, Jethro Beekman, Mathias Payer, Nicolas Weaver, David Adrian, Vern Paxson, Michael Bailey, et al. 2014. The matter of heartbleed. In Proceedings of the 2014 Conference on Internet Measurement Conference (IMC'14). ACM, 475--488.
[11]
Jean-Rémy Falleri, Floréal Morandat, Xavier Blanc, Matias Martinez, and Martin Monperrus. 2014. Fine-grained and accurate source code differencing. In ACM/IEEE International Conference on Automated Software Engineering (ASE’14), Vasteras, Sweden - September 15-19, 2014. 313--324.
[12]
Beat Fluri, Michael Wuersch, Martin Pinzger, and Harald Gall. 2007. Change distilling: Tree differencing for fine-grained source code change extraction. IEEE Transactions on Software Engineering 33, 11 (2007), 725--743.
[13]
Zachary P. Fry, Bryan Landau, and Westley Weimer. 2012. A human study of patch maintainability. In Proceedings of the 2012 International Symposium on Software Testing and Analysis (ISSTA’12). ACM, New York, NY, 177--187.
[14]
V. U. Gomez, S. Ducasse, and T. D’Hondt. 2010. Visually supporting source code changes integration: The torch dashboard. In 2010 17th Working Conference on Reverse Engineering. 55--64.
[15]
Claire Le Goues, Michael Pradel, and Abhik Roychoudhury. 2019. Automated program repair. Communications of the ACM 62, 12 (2019), 56--65.
[16]
Kaifeng Huang, Bihuan Chen, Xin Peng, Daihong Zhou, Ying Wang, Yang Liu, and Wenyun Zhao. 2018. ClDiff: Generating concise linked code differences. In Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering (ASE'18). ACM, 679--690.
[17]
Jiajun Jiang, Yingfei Xiong, Hongyu Zhang, Qing Gao, and Xiangqun Chen. 2018. Shaping program repair space with existing patches and similar code. In Proceedings of the 27th ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA'18). ACM, 298--309.
[18]
Lingxiao Jiang, Ghassan Misherghi, Zhendong Su, and Stephane Glondu. 2007. DECKARD: Scalable and accurate tree-based detection of code clones. In Proceedings of the 29th International Conference on Software Engineering (ICSE’07). IEEE Computer Society, 96--105.
[19]
Dongsun Kim, Jaechang Nam, Jaewoo Song, and Sunghun Kim. 2013. Automatic patch generation learned from human-written patches. In Proceedings of the 35th International Conference on Software Engineering (ICSE'13). IEEE Press, 802--811.
[20]
Miryung Kim and David Notkin. 2009. Discovering and representing systematic code changes. In 2009 IEEE 31st International Conference on Software Engineering (ICSE'09). IEEE, 309--319.
[21]
Janusz Laski and Wojciech Szermer. 1992. Identification of program modifications and its applications in software maintenance. In Proceedings Conference on Software Maintenance 1992. IEEE, 282--290.
[22]
Xuan Bach D. Le, David Lo, and Claire Le Goues. 2016. History driven program repair. In 2016 IEEE 23rd International Conference on Software Analysis, Evolution, and Reengineering (SANER’16), Vol. 1. IEEE, 213--224.
[23]
Claire Le Goues, Michael Dewey-Vogt, Stephanie Forrest, and Westley Weimer. 2012. A systematic study of automated program repair: Fixing 55 out of 105 bugs for $8 each. In Proceedings of the 34th International Conference on Software Engineering (ICSE’12). IEEE Press, Piscataway, NJ, 3--13. http://dl.acm.org/citation.cfm?id=2337223.2337225.
[24]
Claire Le Goues, ThanhVu Nguyen, Stephanie Forrest, and Westley Weimer. 2012. GenProg: A generic method for automatic software repair. IEEE Transactions on Software Engineering 38, 1 (Jan. 2012), 54--72.
[25]
Fan Long, Peter Amidon, and Martin Rinard. 2017. Automatic inference of code transforms for patch generation. In Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering (ESEC/FSE’17). Association for Computing Machinery, New York, NY, 727--739.
[26]
Fan Long and Martin Rinard. 2016. Automatic patch generation by learning correct code. In 43rd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL’16). ACM, New York, NY, 298--312.
[27]
Yanxin Lu, Swarat Chaudhuri, Chris Jermaine, and David Melski. 2018. Program splicing. In Proceedings of the 40th International Conference on Software Engineering (ICSE'18). ACM, 338--349.
[28]
Alexandru Marginean, Earl T. Barr, Mark Harman, and Yue Jia. 2015. Automated transplantation of call graph and layout features into Kate. In International Symposium on Search Based Software Engineering (ICSE'13). Springer, 262--268.
[29]
Matias Martinez, Thomas Durieux, Romain Sommerard, Jifeng Xuan, and Martin Monperrus. 2017. Automatic repair of real bugs in Java: A large-scale experiment on the Defects4J dataset. Empirical Software Engineering (Empirical Softw. Engg) 22, 4 (Aug. 2017), 1936--1964.
[30]
Matias Martinez and Martin Monperrus. 2016. ASTOR: A program repair library for Java (demo). In Proceedings of the 25th International Symposium on Software Testing and Analysis (ISSTA’16). ACM, New York, NY, 441--444.
[31]
Sergey Mechtaev, Xiang Gao, Shin Hwei Tan, and Abhik Roychoudhury. 2018. Test-equivalence analysis for automatic patch generation. ACM Transactions on Software Engineering and Methodology (TOSEM) 27, 4 (2018).
[32]
Sergey Mechtaev, Manh-Dung Nguyen, Yannic Noller, Lars Grunske, and Abhik Roychoudhury. 2018. Semantic program repair using a reference implementation. In Proceedings of the 40th International Conference on Software Engineering (ICSE'18). ACM, 129--139.
[33]
Sergey Mechtaev, Jooyong Yi, and Abhik Roychoudhury. 2015. Directfix: Looking for simple program repairs. In Proceedings of the 37th International Conference on Software Engineering (ICSE'15), Volume 1. IEEE Press, 448--458.
[34]
Sergey Mechtaev, Jooyong Yi, and Abhik Roychoudhury. 2016. Angelix: Scalable multiline program patch synthesis via symbolic analysis. In IEEE/ACM 38th International Conference on Software Engineering (ICSE’16). IEEE, 691–701.
[35]
Na Meng, Miryung Kim, and Kathryn S. McKinley. 2011. Systematic editing: Generating program transformations from an example. In ACM SIGPLAN Notices, Vol. 46. ACM, 329--342.
[36]
Na Meng, Miryung Kim, and Kathryn S. McKinley. 2013. LASE: Locating and applying systematic edits by learning from examples. In Proceedings of the 2013 International Conference on Software Engineering (ICSE'13). IEEE Press, 502--511.
[37]
Hoang Duong Thien Nguyen, Dawei Qi, Abhik Roychoudhury, and Satish Chandra. 2013. SemFix: Program repair via semantic analysis. In Proceedings of the 2013 International Conference on Software Engineering (ICSE’13). IEEE Press, Piscataway, NJ, 772--781. http://dl.acm.org/citation.cfm?id=2486788.2486890.
[38]
Y. Padioleau, J. Lawall, R. R. Hansen, and G. Muller. 2008. Documenting and automating collateral evolutions in Linux device drivers. In ACM SIGOPS/EuroSys European Conference on Computer Systems (Eurosys’08).
[39]
Yoann Padioleau, Julia Lawall, René Rydhof Hansen, and Gilles Muller. 2008. Documenting and automating collateral evolutions in Linux device drivers. In Proceedings of the 3rd ACM SIGOPS/EuroSys European Conference on Computer Systems 2008 (Eurosys’08). ACM, New York, NY, 247--260.
[40]
Nicolas Palix, Gael Thomas, Suman Saha, Christophe Calvès, Julia Lawall, and Gilles Muller. 2011. Faults in Linux: Ten years later. In 16th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS'11).
[41]
Jeff H. Perkins, Sunghun Kim, Sam Larsen, Saman Amarasinghe, Jonathan Bachrach, Michael Carbin, Carlos Pacheco, Frank Sherwood, Stelios Sidiroglou, Greg Sullivan, Weng-Fai Wong, Yoav Zibin, Michael D. Ernst, and Martin Rinard. 2009. Automatically patching errors in deployed software. In 22nd ACM Symposium on Operating Systems Principles (SOSP'09). 87--102.
[42]
Nam H. Pham, Tung Thanh Nguyen, Hoan Anh Nguyen, and Tien N. Nguyen. 2010. Detection of recurring software vulnerabilities. In Proceedings of the IEEE/ACM International Conference on Automated Software Engineering (ASE’10). ACM, New York, NY, 447--456.
[43]
Zichao Qi, Fan Long, Sara Achour, and Martin Rinard. 2015. An analysis of patch plausibility and correctness for generate-and-validate patch generation systems. In Proceedings of the 2015 International Symposium on Software Testing and Analysis (ISSTA’15). Association for Computing Machinery, New York, NY, 24--36.
[44]
Shruti Raghavan, Rosanne Rohana, David Leon, Andy Podgurski, and Vinay Augustine. 2004. Dex: A semantic-graph differencing tool for studying changes in large code bases. In 20th IEEE International Conference on Software Maintenance (ICSM'04). Proceedings. IEEE, 188--197.
[45]
Baishakhi Ray, Miryung Kim, Suzette Person, and Neha Rungta. 2013. Detecting and characterizing semantic inconsistencies in ported code. In 2013 28th IEEE/ACM International Conference on Automated Software Engineering (ASE’13). IEEE, 367--377.
[46]
Reudismam Rolim, Gustavo Soares, Loris D’Antoni, Oleksandr Polozov, Sumit Gulwani, Rohit Gheyi, Ryo Suzuki, and Björn Hartmann. 2017. Learning syntactic program transformations from examples. In Proceedings of the 39th International Conference on Software Engineering (ICSE'17). IEEE Press, 404--415.
[47]
C. K. Roy and J. R. Cordy. 2008. An empirical study of function clones in open source software. In 2008 15th Working Conference on Reverse Engineering. 81--90.
[48]
Koushik Sen. 2007. Concolic testing. In Proceedings of the 22nd IEEE/ACM International Conference on Automated Software Engineering (ASE’07). Association for Computing Machinery, New York, NY, 571--572.
[49]
Shodan. 2016. Devices Vulnerable to Heartbleed. Retrieved September 14, 2018, from https://www.shodan.io/report/89bnfUyJ
[50]
Stelios Sidiroglou-Douskos, Eric Lahtinen, Anthony Eden, Fan Long, and Martin Rinard. 2017. CodeCarbonCopy. In Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering (ESEC/FSE'17). ACM, 95--105.
[51]
Stelios Sidiroglou-Douskos, Eric Lahtinen, Fan Long, and Martin Rinard. 2015. Automatic error elimination by horizontal code transfer across multiple applications. In Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI’15). ACM, New York, NY, 43--54.
[52]
Edward K. Smith, Earl T. Barr, Claire Le Goues, and Yuriy Brun. 2015. Is the cure worse than the disease? Overfitting in automated program repair. In Proceedings of the 10th Joint Meeting on Foundations of Software Engineering (ESEC/FSE'15). 532--543.
[53]
Edward K. Smith, Earl T. Barr, Claire Le Goues, and Yuriy Brun. 2015. Is the cure worse than the disease? Overfitting in automated program repair. In Proceedings of the 2015 10th Joint Meeting on Foundations of Software Engineering (ESEC/FSE’15). ACM, New York, NY, 532--543.
[54]
Shin Hwei Tan, Zhen Dong, Xiang Gao, and Abhik Roychoudhury. 2018. Repairing crashes in Android apps. In Proceedings of the 40th International Conference on Software Engineering (ICSE’18). ACM, New York, NY, 187--198.
[55]
Shin Hwei Tan and Abhik Roychoudhury. 2015. Relifix: Automated repair of software regressions. In Proceedings of the 37th International Conference on Software Engineering - Volume 1 (ICSE’15). IEEE Press, Piscataway, NJ, 471--482. http://dl.acm.org/citation.cfm?id=2818754.2818813
[56]
Shin Hwei Tan, Hiroaki Yoshida, Mukul R. Prasad, and Abhik Roychoudhury. 2016. Anti-patterns in search-based program repair. In Proceedings of the 2016 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering (ESEC/FSE'16). ACM, 727--738.
[57]
Ferdian Thung, Xuan-Bach D. Le, David Lo, and Julia Lawall. 2016. Recommending code changes for automatic backporting of Linux device drivers. In 2016 IEEE International Conference on Software Maintenance and Evolution (ICSME’16). IEEE, 222--232.
[58]
Yuan Tian, Julia Lawall, and David Lo. 2012. Identifying Linux bug fixing patches. In Proceedings of the 34th International Conference on Software Engineering (ICSE’12). IEEE Press, Piscataway, NJ, 386--396. http://dl.acm.org/citation.cfm?id=2337223.2337269
[59]
Rijnard van Tonder and Claire Le Goues. 2018. Static automated program repair for heap properties. In 2018 IEEE/ACM 40th International Conference on Software Engineering (ICSE’18). IEEE, 151--162.
[60]
Westley Weimer, Zachary P. Fry, and Stephanie Forrest. 2013. Leveraging program equivalence for adaptive program repair: Models and first results. In Proceedings of the 28th IEEE/ACM International Conference on Automated Software Engineering (ASE’13). IEEE Press, Piscataway, NJ, 356--366.
[61]
Westley Weimer, ThanhVu Nguyen, Claire Le Goues, and Stephanie Forrest. 2009. Automatically finding patches using genetic programming. In IEEE/ACM International Conference on Software Engineering (ICSE’09).
[62]
Qi Xin and Steven P. Reiss. 2017. Identifying test-suite-overfitted patches through test case generation. In Proceedings of the 26th ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA’17). ACM, New York, NY, 226--236.
[63]
Yingfei Xiong, Jie Wang, Runfa Yan, Jiachen Zhang, Shi Han, Gang Huang, and Lu Zhang. 2017. Precise condition synthesis for program repair. In Proceedings of the 39th International Conference on Software Engineering (ICSE’17). IEEE Press, Piscataway, NJ, 416--426.
[64]
J. Xuan, M. Martinez, F. DeMarco, M. Clement, S. Lamelas Marcote, T. Durieux, D. Le Berre, and M. Monperrus. 2016. Nopol: Automatic repair of conditional statement bugs in Java programs. IEEE Transactions on Software Engineering PP, 99 (2016), 1--1.
[65]
Wuu Yang. 1991. Identifying syntactic differences between two programs. Software: Practice and Experience 21, 7 (1991), 739--755.
[66]
Tianyi Zhang and Miryung Kim. 2017. Automated transplantation and differential testing for clones. In Proceedings of the 39th International Conference on Software Engineering (ICSE’17). IEEE Press, Piscataway, NJ, 665--676.

Cited By

View all
  • (2024)Evolving Paradigms in Automated Program Repair: Taxonomy, Challenges, and OpportunitiesACM Computing Surveys10.1145/369645057:2(1-43)Online publication date: 10-Oct-2024
  • (2024)Automating Zero-Shot Patch Porting for Hard ForksProceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis10.1145/3650212.3652134(363-375)Online publication date: 11-Sep-2024
  • (2024)CodeArt: Better Code Models by Attention Regularization When Symbols Are LackingProceedings of the ACM on Software Engineering10.1145/36437521:FSE(562-585)Online publication date: 12-Jul-2024
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Transactions on Software Engineering and Methodology
ACM Transactions on Software Engineering and Methodology  Volume 30, Issue 1
Continuous Special Section: AI and SE
January 2021
444 pages
ISSN:1049-331X
EISSN:1557-7392
DOI:10.1145/3446626
  • Editor:
  • Mauro Pezzè
Issue’s Table of Contents
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 31 December 2020
Accepted: 01 July 2020
Revised: 01 June 2020
Received: 01 November 2019
Published in TOSEM Volume 30, Issue 1

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. Program repair
  2. code transplantation
  3. dynamic program analysis
  4. patch transplantation

Qualifiers

  • Research-article
  • Research
  • Refereed

Funding Sources

  • National Research Foundation Singapore
  • Natural Science Foundation of Guangdong Province
  • National Natural Science Foundation of China

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)105
  • Downloads (Last 6 weeks)10
Reflects downloads up to 02 Mar 2025

Other Metrics

Citations

Cited By

View all
  • (2024)Evolving Paradigms in Automated Program Repair: Taxonomy, Challenges, and OpportunitiesACM Computing Surveys10.1145/369645057:2(1-43)Online publication date: 10-Oct-2024
  • (2024)Automating Zero-Shot Patch Porting for Hard ForksProceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis10.1145/3650212.3652134(363-375)Online publication date: 11-Sep-2024
  • (2024)CodeArt: Better Code Models by Attention Regularization When Symbols Are LackingProceedings of the ACM on Software Engineering10.1145/36437521:FSE(562-585)Online publication date: 12-Jul-2024
  • (2023)PEM: Representing Binary Program Semantics for Similarity Analysis via a Probabilistic Execution ModelProceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering10.1145/3611643.3616301(401-412)Online publication date: 30-Nov-2023
  • (2023)Improving Binary Code Similarity Transformer Models by Semantics-Driven Instruction DeemphasisProceedings of the 32nd ACM SIGSOFT International Symposium on Software Testing and Analysis10.1145/3597926.3598121(1106-1118)Online publication date: 12-Jul-2023
  • (2023)Enhancing OSS Patch Backporting with SemanticsProceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security10.1145/3576915.3623188(2366-2380)Online publication date: 15-Nov-2023
  • (2023) FS 3 change : A Scalable Method for Change Pattern Mining IEEE Transactions on Software Engineering10.1109/TSE.2023.3269500(1-14)Online publication date: 2023
  • (2023)Analysis and Propagation of Feature Revisions in Preprocessor-based Software Product Lines2023 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER)10.1109/SANER56733.2023.00035(284-295)Online publication date: Mar-2023
  • (2023)Software Testing Research Challenges: An Industrial Perspective2023 IEEE Conference on Software Testing, Verification and Validation (ICST)10.1109/ICST57152.2023.00008(1-10)Online publication date: Apr-2023
  • (2023)ScaleFix: An Automated Repair of UI Scaling Accessibility Issues in Android Applications2023 IEEE International Conference on Software Maintenance and Evolution (ICSME)10.1109/ICSME58846.2023.00025(147-159)Online publication date: 1-Oct-2023
  • Show More Cited By

View Options

Login options

Full Access

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

HTML Format

View this article in HTML Format.

HTML Format

Figures

Tables

Media

Share

Share

Share this Publication link

Share on social media