Abstract
Superscalar processors increasingly rely on compile-time instruction scheduling to reorder code for parallel execution at run time. Although significant speedups have been achieved, every scheduling algorithm used to reorder code introduces, either explicitly or implicitly, barriers to code motion, which in turn limit performance. In this paper we use trace-driven simulation to quantify the impact of various barriers to code motion introduced during instruction scheduling. The results will be used to direct our future research into instruction scheduling technology.
Copyright information
© 1996 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Potter, R., Steven, G. (1996). Investigating the limits of fine-grained parallelism in a statically scheduled superscalar architecture. In: Bougé, L., Fraigniaud, P., Mignotte, A., Robert, Y. (eds) Euro-Par'96 Parallel Processing. Euro-Par 1996. Lecture Notes in Computer Science, vol 1124. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0024777
DOI: https://doi.org/10.1007/BFb0024777
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-61627-6
Online ISBN: 978-3-540-70636-6
eBook Packages: Springer Book Archive