Abstract
We present several new compiler techniques employed by our interprocedural parallelizing research compiler, Panorama, to improve loop parallelization and the efficiency of memory references. We first give an overview of the compiler and its associated memory architecture simulation environments. We then present three techniques: an interprocedural array dataflow analysis, based on guarded array regions, for automatic array privatization; an interprocedural static profile analysis; and a graph reduction algorithm for parallel task assignment and data allocation that aims to reduce remote memory references while maintaining loop parallelism.
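As a rough illustration of what array privatization means, the sketch below shows a loop whose work array is fully written before it is read in every iteration. Giving each iteration its own copy removes the cross-iteration dependence, so the iterations can run in any order. This is not Panorama's analysis; the names `sequential`, `privatized`, and `tmp` are hypothetical and chosen only for this example.

```python
def sequential(a, n):
    """Original loop: every iteration reuses the same work array tmp."""
    tmp = [0] * n
    out = []
    for i in range(len(a)):
        for j in range(n):        # tmp is fully written ...
            tmp[j] = a[i] * j
        out.append(sum(tmp))      # ... before it is read
    return out


def privatized(a, n, order):
    """Same loop with a private tmp per iteration, run in the given order."""
    out = [None] * len(a)
    for i in order:
        tmp = [0] * n             # private copy: no cross-iteration dependence
        for j in range(n):
            tmp[j] = a[i] * j
        out[i] = sum(tmp)
    return out


a = [3, 1, 4, 1, 5]
# Any iteration order gives the same result once tmp is private.
assert privatized(a, 4, [4, 2, 0, 3, 1]) == sequential(a, 4)
```

The hard part for a compiler is proving that every element read in an iteration was written earlier in that same iteration, possibly under conditions or across procedure calls; the sketch assumes that property rather than checking it.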
Sponsored in part by the U.S. Army, Army Research Laboratory, Army HPC Research Center. No official endorsement should be inferred. This work is also supported in part by the National Science Foundation under grant CCR-9210913, and by Computing Devices, International.
Copyright information
© 1996 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Nguyen, T., Gu, J., Li, Z. (1996). An interprocedural parallelizing compiler and its support for memory hierarchy research. In: Huang, CH., Sadayappan, P., Banerjee, U., Gelernter, D., Nicolau, A., Padua, D. (eds) Languages and Compilers for Parallel Computing. LCPC 1995. Lecture Notes in Computer Science, vol 1033. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0014194
DOI: https://doi.org/10.1007/BFb0014194
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-60765-6
Online ISBN: 978-3-540-49446-1
eBook Packages: Springer Book Archive