Abstract
We describe a new dynamic software scheduling technique for VLIW architectures, which compiles into VLIW code the program paths that are actually executed. Unlike trace processors or DIF, the technique executes operations speculatively on multiple paths through the code, is resilient to branch mispredictions, and can achieve the very large dynamic window sizes necessary for high ILP. Aggressive optimizations are applied to frequently executed portions of the code. Encouraging performance results were obtained on SPECint95 and TPC-C. The technique can be used for binary translation to achieve architectural compatibility with an existing processor, or as a VLIW scheduling technique in its own right.
Copyright information
© 1999 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ebcioğlu, K., Altman, E.R., Sathaye, S., Gschwind, M. (1999). Execution-Based Scheduling for VLIW Architectures. In: Amestoy, P., et al. Euro-Par’99 Parallel Processing. Euro-Par 1999. Lecture Notes in Computer Science, vol 1685. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48311-X_181
DOI: https://doi.org/10.1007/3-540-48311-X_181
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-66443-7
Online ISBN: 978-3-540-48311-3
eBook Packages: Springer Book Archive