Evaluating indirect branch handling mechanisms in software dynamic translation systems

Published: 22 June 2011

Abstract

Software Dynamic Translation (SDT) is used for instrumentation, optimization, security, and many other purposes. A major source of SDT overhead is the code executed to translate an indirect branch's target address into the address of the translated destination block.
This article discusses sources of Indirect Branch (IB) overhead in SDT systems and evaluates techniques for overhead reduction. Measurements using SPEC CPU2000 show that the appropriate choice and configuration of IB translation mechanisms can significantly reduce the overhead. Further, cross-architecture evaluation of these mechanisms reveals that the most efficient implementation and configuration can be highly dependent on the architecture implementation.
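To make the translation step concrete, here is a minimal sketch of one common mechanism, a direct-mapped indirect branch translation cache (IBTC, one of the author tags listed for this article). It is not the article's implementation: the class name `IBTC`, the helper `make_translator`, the table size, the hash (shift by two bits, assuming 4-byte-aligned instructions), and the fake code-cache addresses are all assumptions chosen for illustration. A real SDT system would emit this lookup as inline machine code rather than run it in Python.

```python
# Hypothetical sketch of a direct-mapped IBTC lookup.
# All names and constants are illustrative, not taken from the article.

class IBTC:
    def __init__(self, size, slow_translate):
        # Size must be a power of two so a mask can replace a modulo.
        assert size > 0 and size & (size - 1) == 0
        self.mask = size - 1
        self.tags = [None] * size      # application (untranslated) addresses
        self.targets = [None] * size   # translated code-cache addresses
        self.slow_translate = slow_translate
        self.hits = 0
        self.misses = 0

    def lookup(self, app_addr):
        # Assumption: instructions are 4-byte aligned, so drop the low 2 bits.
        idx = (app_addr >> 2) & self.mask
        if self.tags[idx] == app_addr:
            self.hits += 1             # fast path: tag matches
            return self.targets[idx]
        # Slow path: ask the translator, then overwrite the conflicting entry.
        self.misses += 1
        translated = self.slow_translate(app_addr)
        self.tags[idx] = app_addr
        self.targets[idx] = translated
        return translated


def make_translator():
    """Stand-in for the translator's slow path: assigns each new
    application address a fake code-cache address."""
    table = {}

    def translate(app_addr):
        if app_addr not in table:
            table[app_addr] = 0x70000000 + 0x40 * len(table)
        return table[app_addr]

    return translate
```

In this sketch, a hit costs one hash, one compare, and one load, while a miss falls back to the slow path. Two targets that map to the same slot evict each other, which is one reason table size and hash choice could matter for overhead.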


Published In

ACM Transactions on Architecture and Code Optimization, Volume 8, Issue 2
July 2011
113 pages
ISSN: 1544-3566
EISSN: 1544-3973
DOI: 10.1145/1970386

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 22 June 2011
Accepted: 01 February 2011
Revised: 01 September 2010
Received: 01 May 2009
Published in TACO Volume 8, Issue 2


Author Tags

  1. Fast returns
  2. IBTC
  3. indirect branch
  4. indirect jump
  5. return cache
  6. sieve
  7. software dynamic translation

Qualifiers

  • Research-article
  • Research
  • Refereed


