Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/3617232.3624862acmconferencesArticle/Chapter ViewAbstractPublication PagesasplosConference Proceedingsconference-collections
research-article
Open access

Lightweight, Modular Verification for WebAssembly-to-Native Instruction Selection

Published: 17 April 2024 Publication History

Abstract

Language-level guarantees---like module runtime isolation for WebAssembly (Wasm)---are only as strong as the compiler that produces a final, native-machine-specific executable. The process of lowering language-level constructions to ISA-specific instructions can introduce subtle bugs that violate security guarantees. In this paper, we present Crocus, a system for lightweight, modular verification of instruction-lowering rules within Cranelift, a production retargetable Wasm native code generator. We use Crocus to verify lowering rules that cover WebAssembly 1.0 support for integer operations in the ARM aarch64 backend. We show that Crocus can reproduce 3 known bugs (including a 9.9/10 severity CVE), identify 2 previously-unknown bugs and an underspecified compiler invariant, and help analyze the root causes of a new bug.

References

[1]
Alessandro Abate, Cristina David, Pascal Kesseli, Daniel Kroening, and Elizabeth Polgreen. Counterexample guided inductive synthesis modulo theories. In International Conference on Computer-Aided Verification (CAV), 2018.
[2]
Maaz Bin Safeer Ahmad, Alexander J. Root, Andrew Adams, Shoaib Kamil, and Alvin Cheung. Vector instruction selection for digital signal processors using program synthesis. In ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2022.
[3]
Bytecode Alliance. ISLE language reference. https://github.com/bytecodealliance/wasmtime/blob/main/cranelift/isle/docs/language-reference.md, 2023.
[4]
Alasdair Armstrong, Thomas Bauereiss, Brian Campbell, Alastair Reid, Kathryn E. Gray, Robert M. Norton, Prashanth Mundkur, Mark Wassell, Jon French, Christopher Pulte, Shaked Flur, Ian Stark, Neel Krishnaswami, and Peter Sewell. ISA semantics for ARMv8-A, RISC-V, and CHERI-MIPS. In ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL), 2019.
[5]
Alasdair Armstrong, Brian Campbell, Ben Simner, Christopher Pulte, and Peter Sewell. Isla: Integrating full-scale ISA semantics and axiomatic concurrency models. In International Conference on Computer-Aided Verification (CAV), 2021.
[6]
Javier Cabrera Arteaga, Nicholas Fitzgerald, Martin Monperrus, and Benoit Baudry. Wasm-mutate: Fuzzing WebAssembly compilers with e-graphs. In E-Graph Research, Applications, Practices, and Human-factors Symposium, 2022. URL: https://www.jacarte.me/assets/pdf/wasm_mutate.pdf.
[7]
Vytautas Astrauskas, Peter Müller, Federico Poli, and Alexander J. Summers. Leveraging Rust types for modular specification and verification. In ACM SIGPLAN Conference on Object Oriented Programming, Systems, Languages and Applications (OOPSLA), 2019.
[8]
Marek Baranowski, Shaobo He, and Zvonimir Rakamaric. Verifying Rust programs with SMACK. 2018.
[9]
Mike Barnett, K. Rustan M. Leino, and Wolfram Schulte. The spec# programming system: An overview. In Construction and Analysis of Safe, Secure, and Interoperable Smart Devices, 2005.
[10]
Björn Roy Baron et al. Cranelift codegen backend for Rust, 2023. URL: https://github.com/bjorn3/rustc_codegen_cranelift.
[11]
Clark W. Barrett, Aaron Stump, and Cesare Tinelli. The SMT-LIB standard version 2.0. In Proceedings of the 8th International Workshop on Satisfiability Modulo Theories (SMT), 2010. URL: https://smtlib.cs.uiowa.edu/papers/smt-lib-reference-v2.0-r10.12.21.pdf.
[12]
Iulia Bastys, Maximilian Algehed, Alexander Sjösten, and Andrei Sabelfeld. Secwasm: Information flow control for WebAssembly. In Static Analysis, 2022.
[13]
Yves Bertot and Pierre Castéran. Interactive theorem proving and program development: Coq'Art: the calculus of inductive constructions. Springer Science & Business Media, 2013.
[14]
Jay Bosamiya, Wen Shih Lim, and Bryan Parno. Provably-safe multilingual software sandboxing using WebAssembly. In USENIX Security Symposium, 2022. URL: https://www.usenix.org/conference/usenixsecurity22/presentation/bosamiya.
[15]
Fraser Brown, John Renner, Andres Nötzli, Sorin Lerner, Hovav Shacham, and Deian Stefan. Towards a verified range analysis for JavaScript JITs. In ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), 2020.
[16]
Sebastian Buchwald, Andreas Fried, and Sebastian Hack. Synthesizing an instruction selection rule library from semantic specifications. In ACM/IEEE International Symposium on Code Generation and Optimization (CGO), 2018.
[17]
Bytecode Alliance. The Cranelift compiler. https://github.com/bytecodealliance/wasmtime/tree/main/cranelift, 2023.
[18]
Bytecode Alliance. Wasmtime: A fast and secure runtime for WebAssembly. https://wasmtime.dev, 2023.
[19]
R. G. Cattell. Automatic derivation of code generators from machine descriptions. ACM Transactions on Programming Languages and Systems (TOPLAS), 1980.
[20]
R G G Cattell. Formalization and Automatic Derivation of Code Generators. PhD thesis, Carnegie Mellon University, 1978. https://apps.dtic.mil/sti/pdfs/ADA058872.pdf.
[21]
J. Ceng, M. Hohenauer, R. Leupers, G. Ascheid, H. Meyr, and G. Braun. C compiler retargeting based on instruction semantics models. In Design, Automation & Test in Europe (DATE), 2005.
[22]
Alex Crichton. Data leakage between instances in the pooling allocator. https://github.com/bytecodealliance/wasmtime/security/advisories/GHSA-wh6w-3828-g9qf, November 2022.
[23]
Alex Crichton. Miscompilation of constant values in division on aarch64. https://github.com/bytecodealliance/wasmtime/security/advisories/GHSA-7f6x-jwh5-m9r4, July 2022.
[24]
Alex Crichton. Miscompilation of 'i8x16.swizzle' and 'select' with v128 inputs. https://github.com/bytecodealliance/wasmtime/security/advisories/GHSA-jqwc-c49r-4w2x, 2022.
[25]
Alex Crichton. Guest-controlled out-of-bounds read/write on x8664. https://github.com/bytecodealliance/wasmtime/security/advisories/GHSA-ff4p-7xrq-q5r8, 2023.
[26]
Ross Daly, Caleb Donovick, Jackson Melchert, Rajsekhar Setaluri, Nestan Tsiskaridze Bullock, Priyanka Raina, Clark Barrett, and Pat Hanrahan. Synthesizing instruction selection rewrite rules from RTL using SMT. In Formal Methods in Computer-Aided Design (FMCAD), 2022.
[27]
Sandeep Dasgupta, Daejun Park, Theodoros Kasampalis, Vikram S. Adve, and Grigore Roşu. A complete formal semantics of x86-64 user-level instruction set architecture. In ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), 2019.
[28]
Leonardo De Moura and Nikolaj Bjørner. Z3: An efficient SMT solver. In Tools and Algorithms for the Construction and Analysis of Systems (TACAS), 2008.
[29]
Nachum Dershowitz and Jean-Pierre Jouannaud. Rewrite systems. In Handbook of Theoretical Computer Science, Volume B: Formal Models and Sematics, 1991.
[30]
João Dias and Norman Ramsey. Automatically generating instruction selectors using declarative machine descriptions. 2010.
[31]
Chris Fallin. Memory access due to code generation flaw in Cranelift module. https://github.com/bytecodealliance/wasmtime/security/advisories/GHSA-hpqh-2wqx-7qp5, May 2021.
[32]
Chris Fallin. RFC: Design of the ISLE instruction-selector DSL. https://github.com/bytecodealliance/rfcs/pull/15, August 2021.
[33]
Chris Fallin. Cranelift's instruction selector DSL, ISLE: Term-rewriting made practical. https://cfallin.org/blog/2023/01/20/cranelift-isle/, January 2023.
[34]
Anthony Fox, Magnus O Myreen, Yong Kiam Tan, and Ramana Kumar. Verified compilation of CakeML to multiple machine-code targets. 2017.
[35]
Go Authors. Go compiler backend lowering rules. https://github.com/golang/go/tree/master/src/cmd/compile/internal/ssa/_gen, 2023.
[36]
Andreas Haas, Andreas Rossberg, Derek L Schuff, Ben L Titzer, Michael Holman, Dan Gohman, Luke Wagner, Alon Zakai, and JF Bastien. Bringing the web up to speed with WebAssembly. In ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), 2017.
[37]
Rikard Hjort. Formally verifying WebAssembly with KWasm, 2020. URL: https://odr.chalmers.se/server/api/core/bitstreams/a06be182-a12e-46ce-94d3-cff7a5dc42ba/content.
[38]
C. A. R. Hoare. An axiomatic basis for computer programming. In Communications of the ACM (CACM), 1969.
[39]
Roger Hoover and Kenneth Zadeck. Generating machine specific optimizing compilers. In ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL), 1996.
[40]
Susmit Jha, Rhishikesh Limaye, and Sanjit A. Seshia. Beaver: Engineering an efficient SMT solver for bit-vector arithmetic. In Computer Aided Verification, 2009.
[41]
Evan Johnson, Evan Laufer, Zijie Zhao, Dan Gohman, Shravan Narayan, Stefan Savage, Deian Stefan, and Fraser Brown. WaVe: a verifiably secure WebAssembly sandboxing runtime. In IEEE Security and Privacy (Oakland), 2023.
[42]
Evan Johnson, David Thien, Yousef Alhessi, Shravan Narayan, Fraser Brown, Sorin Lerner, Tyler McMullen, Stefan Savage, and Deian Stefan. Trust but verify: SFI safety for native-compiled Wasm. 2021. URL: https://cseweb.ucsd.edu/~lerner/papers/wasm-sfi-ndss2021.pdf.
[43]
Kenton Varda. WebAssembly on Cloudflare workers. https://blog.cloudflare.com/webassembly-on-cloudflare-workers/, 2018.
[44]
Ramana Kumar, Magnus O. Myreen, Michael Norrish, and Scott Owens. CakeML: A verified implementation of ML. In ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL), 2014.
[45]
Sudipta Kundu, Zachary Tatlock, and Sorin Lerner. Proving optimizations correct using parameterized program equivalence. In ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), 2009.
[46]
Chris Lattner and Vikram Adve. LLVM: A compilation framework for lifelong program analysis & transformation. In ACM/IEEE International Symposium on Code Generation and Optimization (CGO), 2004.
[47]
Sorin Lerner, Todd Millstein, and Craig Chambers. Automatically proving the correctness of compiler optimizations. In ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), 2003.
[48]
Sorin Lerner, Todd Millstein, Erika Rice, and Craig Chambers. Automated soundness proofs for dataflow analyses and transformations via local rules. In ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL), 2005.
[49]
Xavier Leroy. Formal verification of a realistic compiler. Communications of the ACM (CACM), 52(7):107--115, 2009.
[50]
Xavier Leroy. A formally verified compiler back-end. Journal of Automated Reasoning, 43(4):363--446, 2009.
[51]
Nuno P Lopes, Juneyoung Lee, Chung-Kil Hur, Zhengyang Liu, and John Regehr. Alive2: bounded translation validation for LLVM. In ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), 2021.
[52]
Nuno P Lopes, David Menendez, Santosh Nagarakatte, and John Regehr. Provably correct peephole optimizations with Alive. In ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), 2015.
[53]
Nuno P. Lopes and John Regehr. Future directions for optimizing compilers. 2018. URL: https://arxiv.org/pdf/1809.02161.pdf.
[54]
Alberto Martelli and Ugo Montanari. An efficient unification algorithm. In ACM Transactions on Programming Languages and Systems (TOPLAS), 1982.
[55]
Charith Mendis and Saman Amarasinghe. GoSLP: Globally optimized superword level parallelism framework. In ACM SIGPLAN Conference on Object Oriented Programming, Systems, Languages and Applications (OOPSLA), 2018.
[56]
Matthew W. Moskewicz, Conor F. Madigan, Ying Zhao, Lintao Zhang, and Sharad Malik. Chaff: Engineering an efficient SAT solver. In Proceedings of the 38th Annual Design Automation Conference, 2001.
[57]
Manasij Mukherjee, Pranav Kant, Zhengyang Liu, and John Regehr. Dataflow-based pruning for speeding up superoptimization. In ACM SIGPLAN Conference on Object Oriented Programming, Systems, Languages and Applications (OOPSLA), 2020.
[58]
Joshua Nelson. Using rustc_codegen_cranelift for debug builds. https://blog.rust-lang.org/inside-rust/2020/11/15/Using-rustc_codegen_cranelift.html, November 2020.
[59]
Luke Nelson, Jacob Van Geffen, Emina Torlak, and Xi Wang. Specification and verification in the field: Applying formal methods to BPF just-in-time compilers in the Linux kernel. In USENIX Symposium on Operating Systems Design and Implementation (OSDI), 2020.
[60]
Aina Niemetz, Mathias Preiner, Andrew Reynolds, Yoni Zohar, Clark Barrett, and Cesare Tinelli. Towards bit-width-independent proofs in SMT solvers. In International Conference on Automated Deduction (CADE), 2019.
[61]
Pat Hickey. Lucet takes WebAssembly beyond the browser | Fastly. https://www.fastly.com/blog/announcing-lucet-fastly-native-webassembly-compiler-runtime, 2019.
[62]
Jonathan Protzenko, Benjamin Beurdouche, Denis Merigoux, and Karthikeyan Bhargavan. Formally verified cryptographic web applications in WebAssembly. In IEEE Symposium on Security and Privacy (SP), 2019.
[63]
Jonathan Ragan-Kelley, Connelly Barnes, Andrew Adams, Sylvain Paris, Frédo Durand, and Saman P. Amarasinghe. Halide: A language and compiler for optimizing parallelism, locality, and recomputation in image processing pipelines. In ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), 2013.
[64]
Alastair Reid, Luke Church, Shaked Flur, Sarah de Haas, Maritza Johnson, and Ben Laurie. Towards making formal methods normal: meeting developers where they are. 2020.
[65]
Gang Ren, Peng Wu, and David Padua. Optimizing data permutations for SIMD devices. page 118--131, 2006.
[66]
Andreas Rossberg. WebAssembly Specification Release 1.0. https://webassembly.github.io/JS-BigInt-integration/core/_download/WebAssembly.pdf, 2019.
[67]
Andreas Rossberg. WebAssembly Specification Release 2.0 Draft Draft 2023-04-08. https://webassembly.github.io/spec/core/_download/WebAssembly.pdf, 2023.
[68]
Gordon Stewart, Lennart Beringer, Santiago Cuellar, and Andrew W. Appel. Compositional CompCert. In ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL), 2015.
[69]
Yong Kiam Tan, Magnus O Myreen, Ramana Kumar, Anthony Fox, Scott Owens, and Michael Norrish. The verified CakeML compiler backend. Journal of Functional Programming, 29, 2019. URL: https://cakeml.org/jfp19.pdf.
[70]
Alexa VanHattum, Rachit Nigam, Vincent T. Lee, James Bornholt, and Adrian Sampson. Vectorization for digital signal processors via equality saturation. In ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2021.
[71]
Vercel Inc. Using WebAssembly (Wasm) at the edge. https://vercel.com/docs/concepts/functions/edge-functions/wasm, 2023.
[72]
Eelco Visser, Zine-el-Abidine Benaissa, and Andrew Tolmach. Building program optimizers with rewriting strategies. In ACM International Conference on Functional Programming (ICFP), 1998.
[73]
Conrad Watt. Mechanising and verifying the WebAssembly specification. 2018.
[74]
Conrad Watt, John Renner, Natalie Popescu, Sunjay Cauligi, and Deian Stefan. CT-Wasm: Type-driven secure cryptography for the web ecosystem. In ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL), 2019.
[75]
Xuejun Yang, Yang Chen, Eric Eide, and John Regehr. Finding and understanding bugs in C compilers. In ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), 2011.

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
ASPLOS '24: Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 1
April 2024
494 pages
ISBN:9798400703720
DOI:10.1145/3617232
This work is licensed under a Creative Commons Attribution International 4.0 License.

Sponsors

In-Cooperation

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 17 April 2024

Check for updates

Badges

Author Tags

  1. instruction selection
  2. source code generation
  3. compiler verification
  4. webassembly
  5. sandboxing

Qualifiers

  • Research-article

Funding Sources

Conference

ASPLOS '24

Acceptance Rates

Overall Acceptance Rate 535 of 2,713 submissions, 20%

Upcoming Conference

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • 0
    Total Citations
  • 451
    Total Downloads
  • Downloads (Last 12 months)451
  • Downloads (Last 6 weeks)62
Reflects downloads up to 24 Nov 2024

Other Metrics

Citations

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Login options

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media