DOI: 10.1145/3442381.3450138

An Empirical Study of Real-World WebAssembly Binaries: Security, Languages, Use Cases

Published: 03 June 2021

Abstract

WebAssembly has emerged as a low-level language for the web and beyond. Despite its popularity in different domains, little is known about WebAssembly binaries that occur in the wild. This paper presents a comprehensive empirical study of 8,461 unique WebAssembly binaries gathered from a wide range of sources, including source code repositories, package managers, and live websites. We study the security properties, source languages, and use cases of the binaries and how they influence the security of the WebAssembly ecosystem. Our findings update some previously held assumptions about real-world WebAssembly and highlight problems that call for future research. For example, we show that vulnerabilities that propagate from insecure source languages potentially affect a wide range of binaries (e.g., two thirds of the binaries are compiled from memory unsafe languages, such as C and C++) and that 21% of all binaries import potentially dangerous APIs from their host environment. We also show that cryptomining, which once accounted for the majority of all WebAssembly code, has been marginalized (less than 1% of all binaries found on the web) and gives way to a diverse set of use cases. Finally, 29% of all binaries on the web are minified, calling for techniques to decompile and reverse engineer WebAssembly. Overall, our results show that WebAssembly has left its infancy and is growing up into a language that powers a diverse ecosystem, with new challenges and opportunities for security researchers and practitioners. Besides these insights, we also share the dataset underlying our study, which is 58 times larger than the largest previously reported benchmark.

Published In

WWW '21: Proceedings of the Web Conference 2021
April 2021
4054 pages
ISBN: 9781450383127
DOI: 10.1145/3442381
Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

WWW '21: The Web Conference 2021
April 19 - 23, 2021
Ljubljana, Slovenia

Acceptance Rates

Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

Cited By

  • (2025) From accuracy to approximation: A survey on approximate homomorphic encryption and its applications. Computer Science Review 55 (100689). DOI: 10.1016/j.cosrev.2024.100689. Online publication date: Feb-2025
  • (2024) SoK: Analysis Techniques for WebAssembly. Future Internet 16:3 (84). DOI: 10.3390/fi16030084. Online publication date: 29-Feb-2024
  • (2024) JABBERWOCK: A Tool for WebAssembly Dataset Generation and Its Application to Malicious Website Detection. Journal of Information Processing 32 (298-307). DOI: 10.2197/ipsjjip.32.298. Online publication date: 2024
  • (2024) WaDec: Decompiling WebAssembly Using Large Language Model. Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering (481-492). DOI: 10.1145/3691620.3695020. Online publication date: 27-Oct-2024
  • (2024) Wasm-R3: Record-Reduce-Replay for Realistic and Standalone WebAssembly Benchmarks. Proceedings of the ACM on Programming Languages 8:OOPSLA2 (2156-2182). DOI: 10.1145/3689787. Online publication date: 8-Oct-2024
  • (2024) Wapplique: Testing WebAssembly Runtime via Execution Context-Aware Bytecode Mutation. Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis (1035-1047). DOI: 10.1145/3650212.3680340. Online publication date: 11-Sep-2024
  • (2024) Multi-modal Learning for WebAssembly Reverse Engineering. Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis (453-465). DOI: 10.1145/3650212.3652141. Online publication date: 11-Sep-2024
  • (2024) eWAPA: An eBPF-based WASI Performance Analysis Framework for Web Assembly Runtimes. 2024 IEEE International Conference on Software Services Engineering (SSE) (323-333). DOI: 10.1109/SSE62657.2024.00054. Online publication date: 7-Jul-2024
  • (2024) WASMDYPA: Effectively Detecting WebAssembly Bugs via Dynamic Program Analysis. 2024 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER) (296-307). DOI: 10.1109/SANER60148.2024.00037. Online publication date: 12-Mar-2024
  • (2024) A Cross-Architecture Evaluation of WebAssembly in the Cloud-Edge Continuum. 2024 IEEE 24th International Symposium on Cluster, Cloud and Internet Computing (CCGrid) (337-346). DOI: 10.1109/CCGrid59990.2024.00046. Online publication date: 6-May-2024
