Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1007/11859802_16guideproceedingsArticle/Chapter ViewAbstractPublication PagesConference Proceedingsacm-pubtype
Article

Parallelizing user-defined and implicit reductions globally on multiprocessors

Published: 06 September 2006 Publication History

Abstract

Multiprocessors are becoming prevalent in the PC world. Major CPU vendors such as Intel and Advanced Micro Devices have migrated to multicore processors. However, this also means that computers will run an application at full speed only if that application is parallelized. To take advantage of more than a fraction of compute resource on a die, we develop a compiler to parallelize a common and powerful programming paradigm, namely reduction. Our goal is to exploit the full potential of reductions for efficient execution of applications on multiprocessors, including multicores. Note that reduction operations are common in streaming applications, financial computing and HPC domain. In fact, 9% of all MPI invocations in the NAS Parallel Benchmarks are reduction library calls. Recognizing implicit reductions in Fortran and C is important for parallelization on multiprocessors. Recent languages such as Brook Streaming language and Chapel language allow users to specify reduction functions. Our compiler provides a unified framework for processing both implicit and user-defined reductions. Both types of reductions are propagated and analyzed interprocedurally. Our global algorithm can enhance the scope of user-defined reductions and parallelize coarser-grained reductions. Thanking to the powerful algorithm and representation, we obtain an average speedup of 3 on 4 processors. The speedup is only 1.7 if only intraprocedural scalar reductions are parallelized.

References

[1]
Buck. Brook Language Specification. In http://merrimac.stanford.edu/brook, Oct. 2003.
[2]
S. Deitz, D. Callahan, B. Chamberlain, L. Snyder. Global-View Abstractions for User-Defined Reductions and Scans. In Proceedings of the ACM SIGPLAN Symposium on Principles and Practices of Parallel Programming. New York, New York, March 2006.
[3]
M. Hall, S. Amarasinghe, B. Murphy, S. Liao, M. Lam. Detecting Coarse-Grain Parallelism Using an Interprocedural Parallelizing Compiler. In Proceedings of Supercomputing, San Diego, CA, December 1995.
[4]
M. Hall, J. Anderson, S. Amarasinghe, B. Murphy, S. Liao, E. Bugnion and M. S. Lam. Maximizing Multiprocessor Performance with the SUIF Compiler. In IEEE Computer, 29(12), December 1996.
[5]
D. Bailey, T. Harris, W. Saphir, R. Van der Wijngaart, A. Woo, M. Yarrow. The NAS Parallel Benchmarks 2.0. Technical Report RNR-95-020, NASA Ames Research Center, Moffet Field, CA, December 1995.
[6]
G. E. Blelloch. Vector Models for Data Parallel Computing. MIIT Press, Cambridge MA. 1990.
[7]
Intel Multi-Core and AMD Multi-Core Technology. In http://www.intel.com/multi-core/ and http://multicore.amd.com/en/, June 2006.
[8]
K. Iverson. A Programming Language. John Wiley & Sons. 1962.
[9]
S. Liao, Z. Du, G. Wu, G. Lueh. Data and Computation Transformations for Brook Streaming Applications on Multiprocessors. In IEEE/ACM International Symposium on Code Generation and Optimization, New York, NY, March 2006.
[10]
High Performance Fortran Forum. High Performance Fortran Specification Version 2.0, January 1997.
[11]
W. Gropp, E. Lusk, A. Skjellum. Using MPI (2nd edition): Portable Parallel Programming with the Message-Passing Interface. MIT Press, 1999.
[12]
P. Charles, C. Donawa, K. Ebcioglu, C. Grothoff, A. Kielstra, C. von Praun, V. Saraswat, V. Sarkar. X10: An Object-oriented Approach to Non-uniform Cluster Computing. In Proceedings Of the Conference on Object-Oriented Programming Systems, Languages, and Applications (OOPSLA) - Onward! Track, October 2005.
[13]
Official OpenMP Specifications Version 2.5. In http://www.openmp.org, May 2005.
[14]
Fortress: A New Programming Language for Scientific Computing. In http:// research. sun.com/ projects/plrg/fortress0618.pdf, 2005.
[15]
Z. Ammarguellat and W. Harrison. Automatic Recognition of Induction Variables and Recurrence Relations by Abstract Interpretation. In Proceedings of the SIGPLAN '90 Conference on Programming Language Design and Implementation. White Plains, NY, June 1990.
[16]
M. Haghighat and C. Polychronopoulos. Symbolic Analysis: A Basis for Parallelization, Optimization and Scheduling of Programs" In Proceedings of the Sixth Workshop on Languages and Compilers for Parallel Computing, Portland, OR. August 1993. Springer-Verlag Lecture Notes in Computer Science.
[17]
M. Haghighat and C. Polychronopoulos. Symbolic Analysis for Parallelizing Compilers. In ACM Transactions on Programming Languages and Systems, Volume 18, Issue 4, July 1996.
[18]
B. Pottenger and R. Eigenmann. Parallelization in the Presence of Generalized Induction and Reduction Variables. In Proceedings of the 1995 ACM International Conference on Supercomputing, June 1995.
[19]
L. Pointer. Perfect: Performance Evaluation for Cost Effective Transformations Report 2. In Technical Report 964, University of Illinois, Urbana-Champaign, March 1990.

Cited By

View all
  • (2014)Translating imperative code to MapReduceACM SIGPLAN Notices10.1145/2714064.266022849:10(909-927)Online publication date: 15-Oct-2014
  • (2014)Translating imperative code to MapReduceProceedings of the 2014 ACM International Conference on Object Oriented Programming Systems Languages & Applications10.1145/2660193.2660228(909-927)Online publication date: 15-Oct-2014
  • (2009)A translation system for enabling data mining applications on GPUsProceedings of the 23rd international conference on Supercomputing10.1145/1542275.1542331(400-409)Online publication date: 8-Jun-2009

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image Guide Proceedings
ACSAC'06: Proceedings of the 11th Asia-Pacific conference on Advances in Computer Systems Architecture
September 2006
601 pages
ISBN:3540400567
  • Editors:
  • Chris Jesshope,
  • Colin Egan

Publisher

Springer-Verlag

Berlin, Heidelberg

Publication History

Published: 06 September 2006

Author Tags

  1. data flow analysis
  2. implicit reductions
  3. interprocedural analysis
  4. multicore
  5. multiprocessor
  6. parallelization
  7. reduction
  8. reduction recognition
  9. user-defined reductions

Qualifiers

  • Article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)0
  • Downloads (Last 6 weeks)0
Reflects downloads up to 02 Oct 2024

Other Metrics

Citations

Cited By

View all
  • (2014)Translating imperative code to MapReduceACM SIGPLAN Notices10.1145/2714064.266022849:10(909-927)Online publication date: 15-Oct-2014
  • (2014)Translating imperative code to MapReduceProceedings of the 2014 ACM International Conference on Object Oriented Programming Systems Languages & Applications10.1145/2660193.2660228(909-927)Online publication date: 15-Oct-2014
  • (2009)A translation system for enabling data mining applications on GPUsProceedings of the 23rd international conference on Supercomputing10.1145/1542275.1542331(400-409)Online publication date: 8-Jun-2009

View Options

View options

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media