Abstract
The huge processing power needed by multimedia applications has led to multimedia extensions to the instruction sets of microprocessors that exploit subword parallelism. Examples of these extended instruction sets are the Visual Instruction Set of the UltraSPARC processor, the AltiVec instruction set of the PowerPC processor, the MMX and ISS extensions of the Pentium processors, and the MAX-2 instruction set of the HP PA-RISC processor. Currently, these extensions can only be used by programs written in assembly language, through system libraries, or by calling specialized macros in a high-level language. As a result, most applications do not use these instructions. We propose two code generation techniques that produce native code using these multimedia extensions for programs written in a high-level language: classical vectorization and vectorization by unrolling. Vectorization by unrolling is simpler than classical vectorization because data dependence analysis is reduced to the analysis of an acyclic control flow graph. Furthermore, we address the problem of unaligned memory accesses, which can be handled by both static analysis and dynamic run-time checking. Preliminary experimental results for a code generator for the UltraSPARC VIS instruction set show that speedups of up to a factor of 4.8 are possible, and that vectorization by unrolling is much simpler than classical vectorization yet equally effective.
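To make subword parallelism and the dynamic alignment check concrete, the following is a minimal sketch in portable C. It is not the code generator described in the article and does not use VIS instructions; it only emulates eight 8-bit lanes inside one 64-bit word. The names `add_bytes_packed` and `add_u8` are hypothetical, and the 8-byte alignment requirement is an assumption chosen for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Add eight packed 8-bit lanes with per-lane wraparound.
 * The top bit of each lane is handled separately so that
 * carries never cross a lane boundary. */
static uint64_t add_bytes_packed(uint64_t x, uint64_t y)
{
    const uint64_t low7 = 0x7f7f7f7f7f7f7f7fULL;
    const uint64_t high = 0x8080808080808080ULL;
    return ((x & low7) + (y & low7)) ^ ((x ^ y) & high);
}

/* dst[i] = a[i] + b[i] (mod 256) for i in [0, n).
 * The loop is unrolled by a factor of eight and the eight byte
 * additions are replaced by one packed addition. A run-time check
 * (an assumed policy) takes the packed path only when all three
 * pointers are 8-byte aligned; otherwise the scalar loop is used. */
void add_u8(uint8_t *dst, const uint8_t *a, const uint8_t *b, size_t n)
{
    size_t i = 0;

    if ((((uintptr_t)dst | (uintptr_t)a | (uintptr_t)b) & 7u) == 0) {
        for (; i + 8 <= n; i += 8) {
            uint64_t x, y, z;
            memcpy(&x, a + i, 8);   /* load eight subwords at once */
            memcpy(&y, b + i, 8);
            z = add_bytes_packed(x, y);
            memcpy(dst + i, &z, 8); /* store eight results at once */
        }
    }

    /* Scalar loop: handles the remainder, or the whole array when the
     * alignment check fails. */
    for (; i < n; i++)
        dst[i] = (uint8_t)(a[i] + b[i]);
}
```

Both paths produce the same result; the alignment test only decides which one runs. A compiler that can prove alignment statically could omit the test altogether.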
Cite this article
Krall, A., Lelait, S. Compilation Techniques for Multimedia Processors. International Journal of Parallel Programming 28, 347–361 (2000). https://doi.org/10.1023/A:1007507005174