Analyzing Dynamic Code: A Sound Abstract Interpreter for Evil Eval

Published: 21 January 2021

Abstract

Dynamic languages, such as JavaScript, employ string-to-code primitives to turn dynamically generated text into executable code at run-time. These features make standard static analysis extremely hard, if not impossible, because its essential data structures, i.e., the control-flow graph and the system of recursive equations associated with the program to analyze, are themselves dynamically mutating objects. Nevertheless, assembling code at run-time by manipulating strings, as eval does in JavaScript, has always been strongly discouraged, since it is widely held that “eval is evil,” leading static analyzers either to disregard such statements or to ignore their effects. Unfortunately, the lack of formal approaches to analyzing string-to-code statements provides a perfect habitat for malicious code, which is surely evil and does not respect good-practice rules: it can hide its malicious intent in strings that are later converted to code, leaving static analyses blind to the real aim of the code. Hence the need to handle string-to-code statements by approximating what they can execute, so that the analysis can continue (even in the presence of dynamically generated program statements) with an acceptable degree of precision. To reach this goal, we propose a static analysis that collects string values and soundly over-approximates and analyzes the code potentially executed by a string-to-code statement.
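
To make the idea of "collecting string values and over-approximating the code an eval may execute" concrete, here is a minimal sketch in Python. It is not the analysis proposed in the article: the abstract does not specify the abstract domain, so the sketch assumes a deliberately simple one (bounded finite sets of strings with a "top" element meaning "any string"). The tuple-based mini-language, the names `analyze`, `eval_expr`, `join`, `concat`, and the bound `MAX_SET` are all hypothetical. The mini-language has no loops, so no widening is needed.

```python
from typing import Dict, FrozenSet, List, Optional

# Abstract string: a finite set of concrete strings, or None for "any string" (top).
# This finite-set domain is an assumption made for illustration only.
AbsStr = Optional[FrozenSet[str]]
TOP: AbsStr = None
MAX_SET = 8  # hypothetical precision bound; larger sets collapse to TOP


def join(a: AbsStr, b: AbsStr) -> AbsStr:
    """Least upper bound: union of the two sets, or TOP if either is unknown."""
    if a is None or b is None:
        return TOP
    merged = a | b
    return merged if len(merged) <= MAX_SET else TOP


def concat(a: AbsStr, b: AbsStr) -> AbsStr:
    """Abstract concatenation: every pairwise concatenation, or TOP."""
    if a is None or b is None:
        return TOP
    out = frozenset(x + y for x in a for y in b)
    return out if len(out) <= MAX_SET else TOP


def eval_expr(expr: tuple, env: Dict[str, AbsStr]) -> AbsStr:
    """Abstractly evaluate a string expression of the hypothetical mini-language."""
    kind = expr[0]
    if kind == "lit":
        return frozenset([expr[1]])
    if kind == "var":
        return env.get(expr[1], TOP)  # unknown variables may hold any string
    if kind == "cat":
        return concat(eval_expr(expr[1], env), eval_expr(expr[2], env))
    raise ValueError(f"unknown expression: {kind}")


def analyze(block: list, env: Dict[str, AbsStr], sinks: List[AbsStr]) -> Dict[str, AbsStr]:
    """Walk a block, recording for each eval the set of strings it may receive."""
    for stmt in block:
        kind = stmt[0]
        if kind == "assign":
            env = {**env, stmt[1]: eval_expr(stmt[2], env)}
        elif kind == "if":  # condition is unknown: analyze both branches and join
            env_t = analyze(stmt[1], env, sinks)
            env_f = analyze(stmt[2], env, sinks)
            env = {v: join(env_t.get(v, TOP), env_f.get(v, TOP))
                   for v in set(env_t) | set(env_f)}
        elif kind == "eval":
            sinks.append(eval_expr(stmt[1], env))
        else:
            raise ValueError(f"unknown statement: {kind}")
    return env


# Models: x = "a = "; if (?) { y = "1" } else { y = "2" }; eval(x + y + ";"); eval(z)
program = [
    ("assign", "x", ("lit", "a = ")),
    ("if", [("assign", "y", ("lit", "1"))], [("assign", "y", ("lit", "2"))]),
    ("eval", ("cat", ("cat", ("var", "x"), ("var", "y")), ("lit", ";"))),
    ("eval", ("var", "z")),
]
sinks: List[AbsStr] = []
analyze(program, {}, sinks)
```

After the run, `sinks[0]` holds the two candidate code strings `"a = 1;"` and `"a = 2;"`, which a later analysis phase could parse and analyze in turn, while `sinks[1]` is `TOP` because nothing is known about `z`. Falling back to `TOP` rather than guessing is what keeps the approximation sound, at the cost of precision.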

Published In

ACM Transactions on Privacy and Security  Volume 24, Issue 2
May 2021
242 pages
ISSN:2471-2566
EISSN:2471-2574
DOI:10.1145/3446639
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 21 January 2021
Accepted: 01 September 2020
Revised: 01 June 2020
Received: 01 February 2020
Published in TOPS Volume 24, Issue 2

Author Tags

  1. Abstract interpretation
  2. dynamic languages
  3. static analysis

Qualifiers

  • Research-article
  • Research
  • Refereed

Funding Sources

  • Università degli Studi di Verona

Article Metrics

  • Downloads (Last 12 months): 43
  • Downloads (Last 6 weeks): 10
Reflects downloads up to 12 Nov 2024

Cited By

  • (2024) Enhancing Malicious URL Detection: A Novel Framework Leveraging Priority Coefficient and Feature Evaluation. IEEE Access 12, pp. 85001-85026. DOI: 10.1109/ACCESS.2024.3412331. Online publication date: 2024
  • (2024) Tarsis: An effective automata-based abstract domain for string analysis. Journal of Software: Evolution and Process. DOI: 10.1002/smr.2647. Online publication date: 14-Feb-2024
  • (2023) “Fixing” the Specification of Widenings. Challenges of Software Verification, pp. 57-76. DOI: 10.1007/978-981-19-9601-6_4. Online publication date: 22-Jul-2023
  • (2023) Following the Obfuscation Trail: Identifying and Exploiting Obfuscation Signatures in Malicious Code. Foundations and Practice of Security, pp. 321-338. DOI: 10.1007/978-3-031-57537-2_20. Online publication date: 11-Dec-2023
  • (2023) A Machine Learning Approach for Source Code Similarity via Graph-Focused Features. Machine Learning, Optimization, and Data Science, pp. 53-67. DOI: 10.1007/978-3-031-53969-5_5. Online publication date: 22-Sep-2023
  • (2023) Unconstrained Variable Oracles for Faster Numeric Static Analyses. Static Analysis, pp. 65-83. DOI: 10.1007/978-3-031-44245-2_5. Online publication date: 22-Oct-2023
  • (2023) Domain Precision in Galois Connection-Less Abstract Interpretation. Static Analysis, pp. 434-459. DOI: 10.1007/978-3-031-44245-2_19. Online publication date: 24-Oct-2023
  • (2023) How Fitting is Your Abstract Domain? Static Analysis, pp. 286-309. DOI: 10.1007/978-3-031-44245-2_14. Online publication date: 24-Oct-2023
  • (2022) Property-Driven Code Obfuscations Reinterpreting Jones-Optimality in Abstract Interpretation. Static Analysis, pp. 247-271. DOI: 10.1007/978-3-031-22308-2_12. Online publication date: 2-Dec-2022
  • (2022) Relational String Abstract Domains. Verification, Model Checking, and Abstract Interpretation, pp. 20-42. DOI: 10.1007/978-3-030-94583-1_2. Online publication date: 14-Jan-2022
