
A Retargetable System-level DBT Hypervisor

Published: 30 May 2020

Abstract

System-level Dynamic Binary Translation (DBT) provides the capability to boot an Operating System (OS) and execute programs compiled for an Instruction Set Architecture (ISA) different from that of the host machine. Due to their performance-critical nature, system-level DBT frameworks are typically hand-coded and heavily optimized, both for their guest and host architectures. While this results in good performance of the DBT system, the engineering costs of supporting a new architecture or extending an existing one are high. In this article, we develop a novel, retargetable DBT hypervisor, which includes guest-specific modules generated from high-level guest machine specifications. Our system not only simplifies retargeting of the DBT but also delivers performance exceeding that of existing manually created DBT solutions. We achieve this by combining offline and online optimizations and exploiting the freedom of a Just-in-time (JIT) compiler operating in a bare-metal environment provided by a Virtual Machine (VM) hypervisor. We evaluate our DBT using both targeted micro-benchmarks and standard application benchmarks, and we demonstrate its ability to outperform the de facto standard QEMU DBT system. Our system delivers an average speedup of 2.21× over QEMU across SPEC CPU2006 integer benchmarks running in a full-system Linux OS environment, compiled for the 64-bit ARMv8-A ISA and hosted on an x86-64 platform. For floating-point applications the speedup is even higher, reaching 6.49× on average. We demonstrate that our system-level DBT system significantly reduces the effort required to support a new ISA while delivering outstanding performance.
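To make the general idea of dynamic binary translation concrete, the following is a minimal sketch of the translate-once, cache, then execute loop that DBT systems are commonly built around. It is not the system described in the abstract: the three-instruction guest ISA (`add`, `jnz`, `halt`), the function names `translate_block` and `run`, and the use of Python closures in place of generated host machine code are all assumptions made purely for illustration.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical toy guest ISA: each instruction is (opcode, operand).
Instr = Tuple[str, int]
State = Dict[str, int]              # guest state: accumulator "acc" and "pc"
HostBlock = Callable[[State], None]


def translate_block(program: List[Instr], start: int) -> HostBlock:
    """Translate one guest basic block, ending at the first branch or halt.

    A real DBT would emit host machine code; here each guest instruction
    becomes a Python closure (an assumption for illustration only).
    """
    ops: List[Callable[[State], None]] = []
    pc = start
    while True:
        opcode, operand = program[pc]
        if opcode == "add":
            ops.append(lambda s, v=operand: s.__setitem__("acc", s["acc"] + v))
        elif opcode == "jnz":
            # Jump to `operand` if acc != 0, otherwise fall through.
            def branch(s: State, target: int = operand, fall: int = pc + 1) -> None:
                s["pc"] = target if s["acc"] != 0 else fall
            ops.append(branch)
            break
        elif opcode == "halt":
            ops.append(lambda s: s.__setitem__("pc", -1))
            break
        else:
            raise ValueError(f"unknown guest opcode: {opcode}")
        pc += 1

    def host_block(state: State) -> None:
        for op in ops:
            op(state)

    return host_block


def run(program: List[Instr], acc: int = 0) -> Tuple[int, int]:
    """Execute a guest program; return (final acc, number of translations)."""
    state: State = {"acc": acc, "pc": 0}
    cache: Dict[int, HostBlock] = {}    # translation cache keyed by guest pc
    translations = 0
    while state["pc"] >= 0:
        pc = state["pc"]
        block = cache.get(pc)
        if block is None:               # cache miss: translate once, then reuse
            block = translate_block(program, pc)
            cache[pc] = block
            translations += 1
        block(state)
    return state["acc"], translations


# A countdown loop: the loop body runs five times but is translated only once.
countdown = [("add", -1), ("jnz", 0), ("halt", 0)]
print(run(countdown, acc=5))  # (0, 2)
```

The loop body at guest address 0 executes five times but is translated once, and the `halt` block accounts for the second translation. Amortizing translation cost over repeated execution in this way is the basic reason DBT can outperform pure interpretation.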


Cited By

  • (2022) Eliminate the overhead of interrupt checking in full-system dynamic binary translator. Proceedings of the 15th ACM International Conference on Systems and Storage, 1-12. DOI: 10.1145/3534056.3534939. Online publication date: 6-Jun-2022
  • (2022) CrossDBT: An LLVM-Based User-Level Dynamic Binary Translation Emulator. Euro-Par 2022: Parallel Processing, 3-18. DOI: 10.1007/978-3-031-12597-3_1. Online publication date: 22-Aug-2022
  • (2020) More with Less – Deriving More Translation Rules with Less Training Data for DBTs Using Parameterization. 2020 53rd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), 415-426. DOI: 10.1109/MICRO50266.2020.00043. Online publication date: Oct-2020


Information & Contributors

Information

Published In

ACM Transactions on Computer Systems, Volume 36, Issue 4
Section: Best of ATC 2019 and Regular Paper
November 2018
115 pages
ISSN: 0734-2071
EISSN: 1557-7333
DOI: 10.1145/3394910
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 30 May 2020
Online AM: 07 May 2020
Accepted: 01 February 2020
Received: 01 November 2019
Published in TOCS Volume 36, Issue 4


Author Tags

  1. Virtualization
  2. dynamic binary translation
  3. hypervisor

Qualifiers

  • Research-article
  • Research
  • Refereed


