DOI: 10.1145/3375894.3375895
research-article

RevEngE is a dish served cold: Debug-Oriented Malware Decompilation and Reassembly

Published: 18 February 2020

Abstract

Malware analysis is key to improving cybersecurity overall. Analysis tools have been evolving from complete static analyzers to decompilers. Malware decompilation allows code inspection at higher abstraction levels, easing incident response. However, the decompilation procedure faces many challenges, such as opaque constructions, irreversible mappings, and semantic gap bridging, among others. In this paper, we propose a new approach that leverages the human analyst's expertise to overcome decompilation challenges. We name this approach "DoD" (debug-oriented decompilation): the analyst reverse engineers the malware sample on their own and instructs the decompiler to translate selected code portions (e.g., decision branches, fingerprinting functions, payloads) into high-level code. With DoD, the analyst might group all decompiled pieces into new code to be analyzed by another tool, or develop a novel malware sample from previous pieces of code and thus exercise a Proof-of-Concept (PoC). To validate our approach, we propose RevEngE, the Reverse Engineering Engine for malware decompilation and reassembly, a set of GDB extensions that intercept and introspect executed functions to build an Intermediate Representation (IR) in real time, enabling any-time decompilation. We evaluate RevEngE with x86 ELF binaries collected from VirusShare, and show that a new malware sample created from the decompilation of independent functions of five known malware samples is considered "clean" by all of VirusTotal's AVs.
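To make the idea of building an IR from executed instructions more concrete, the following is a minimal Python sketch. It is not RevEngE's implementation: the `TraceEntry`, `IRStmt`, and `IRBuilder` names, the tiny instruction subset, and the C-like output format are all assumptions made for illustration, and the trace is hard-coded rather than obtained from GDB.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class TraceEntry:
    """Hypothetical record of one executed instruction, as a debugger hook might report it."""
    address: int
    mnemonic: str
    operands: Tuple[str, ...]


@dataclass(frozen=True)
class IRStmt:
    """Hypothetical IR statement: dst = lhs (plain move) or dst = lhs <op> rhs."""
    dst: str
    op: str
    lhs: str
    rhs: Optional[str] = None


# Assumed mapping from a few x86 mnemonics to C operators (illustrative subset only).
BINARY_OPS = {"add": "+", "sub": "-", "xor": "^", "and": "&", "or": "|"}


def lift(entry: TraceEntry) -> Optional[IRStmt]:
    """Translate one traced instruction into an IR statement, or None if unsupported."""
    if entry.mnemonic == "mov":
        dst, src = entry.operands
        return IRStmt(dst=dst, op="=", lhs=src)
    if entry.mnemonic in BINARY_OPS:
        dst, src = entry.operands
        return IRStmt(dst=dst, op=BINARY_OPS[entry.mnemonic], lhs=dst, rhs=src)
    return None


class IRBuilder:
    """Accumulates IR as instructions execute, so code can be emitted at any time."""

    def __init__(self) -> None:
        self.statements = []
        self.skipped = []

    def on_instruction(self, entry: TraceEntry) -> None:
        # Called once per executed instruction; unsupported ones are kept for the analyst.
        stmt = lift(entry)
        if stmt is None:
            self.skipped.append(entry)
        else:
            self.statements.append(stmt)

    def decompile(self) -> str:
        # Emit C-like text for everything lifted so far.
        lines = []
        for s in self.statements:
            if s.rhs is None:
                lines.append(f"{s.dst} = {s.lhs};")
            else:
                lines.append(f"{s.dst} = {s.lhs} {s.op} {s.rhs};")
        return "\n".join(lines)


# Hard-coded example trace standing in for instructions observed under a debugger.
builder = IRBuilder()
trace = [
    TraceEntry(0x401000, "mov", ("eax", "5")),
    TraceEntry(0x401005, "add", ("eax", "3")),
    TraceEntry(0x401008, "cpuid", ()),
]
for entry in trace:
    builder.on_instruction(entry)

print(builder.decompile())
```

Because the builder keeps its state incrementally, `decompile()` can be called after any number of instructions, which loosely mirrors the "any-time decompilation" property described in the abstract. Instructions the sketch cannot lift (here `cpuid`) are collected in `skipped` so that a human analyst could decide how to handle them.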

Cited By

  • (2024) What do malware analysts want from academia? A survey on the state-of-the-practice to guide research developments. Proceedings of the 27th International Symposium on Research in Attacks, Intrusions and Defenses, 77-96. DOI: 10.1145/3678890.3678892. Online publication date: 30-Sep-2024

Published In

ROOTS'19: Proceedings of the 3rd Reversing and Offensive-oriented Trends Symposium
November 2019
44 pages
ISBN: 9781450377751
DOI: 10.1145/3375894
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Funding Sources

  • Coordenação de Aperfeiçoamento de Pessoal de Nível Superior
  • Conselho Nacional de Desenvolvimento Científico e Tecnológico

Conference

ROOTS'19

Acceptance Rates

ROOTS'19 Paper Acceptance Rate 4 of 6 submissions, 67%;
Overall Acceptance Rate 16 of 26 submissions, 62%
