Abstract
Dynamic optimization has been proposed to overcome many limitations of static optimization and is widely applied in dynamic binary translation (DBT) to enhance system performance. However, almost all existing dynamic optimization techniques employed in DBT systems for a single-threaded execution environment either considerably increase hardware complexity or incur striking runtime overhead. We propose MTCrossBit, a multithreaded DBT framework with no associated hardware support, in which a helper thread builds hot traces to significantly reduce this overhead. In addition, the main thread and the helper thread are assigned to different cores so that multi-core resources are used efficiently and performance improves. Two novel methods yet to be implemented in MTCrossBit are presented: the dual-special-parallel translation caches and a new lock-free thread communication mechanism, assembly language communication (ASLC). We then apply quantitative analysis to show that MTCrossBit can speed up the original CrossBit. We also present results from an implementation of MTCrossBit on multi-core uniprocessor machines using the SPECint 2000 benchmark, and illustrate that the above concurrent architecture achieves some success.
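The division of labour described in the abstract, a main thread that executes translated code while a helper thread builds hot traces, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function name `run`, the threshold `HOT_THRESHOLD`, the counter-based hotness test, and the string stand-in for a built trace do not come from the paper. The hand-off uses Python's locking `queue.Queue`, so it does not model the lock-free ASLC mechanism, and Python threads do not model core assignment.

```python
import queue
import threading
from collections import Counter

HOT_THRESHOLD = 3  # hypothetical value; the abstract states no threshold


def run(block_stream, threshold=HOT_THRESHOLD):
    """Walk a stream of basic-block addresses on the main thread and
    return the traces that a helper thread built for the hot blocks."""
    requests = queue.Queue()  # main -> helper hand-off (locking, unlike ASLC)
    traces = {}               # written only by the helper thread

    def helper():
        while True:
            addr = requests.get()
            if addr is None:  # sentinel: the main thread has finished
                break
            # Stand-in for real trace construction and optimization.
            traces[addr] = f"trace@{addr:#x}"

    worker = threading.Thread(target=helper)
    worker.start()

    counts = Counter()
    for addr in block_stream:
        counts[addr] += 1
        # Request a trace exactly once, when the block first becomes hot.
        if counts[addr] == threshold:
            requests.put(addr)

    requests.put(None)
    worker.join()  # traces is read only after the helper has exited
    return traces
```

Calling `run([0x10, 0x20, 0x10, 0x10, 0x20, 0x30])` requests a trace only for block `0x10`, the one block that reaches the threshold; the main loop never blocks on trace construction, which is the property the helper thread is meant to provide.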
Cite this article
Guan, H., Ma, R., Yang, H. et al. MTCrossBit: A dynamic binary translation system based on multithreaded optimization. Sci. China Inf. Sci. 54, 2064–2078 (2011). https://doi.org/10.1007/s11432-011-4414-5
DOI: https://doi.org/10.1007/s11432-011-4414-5