Abstract
Today, the challenge for software is to exploit the parallelism offered by multi-core architectures. This can be done by rewriting the application, by exploiting hardware capabilities directly, or by expecting the compiler and runtime tools to do the job for us. With the advent of multi-core architectures [1] [2], the problem has become increasingly relevant. Even today, there are few runtime tools that analyze the behavioral patterns of performance-critical applications and recompile them accordingly, so techniques such as OpenMP for shared-memory programs remain useful for exploiting the parallelism available in the machine. This work studies whether loop parallelization, both with and without applying loop transformations, is an effective way to run scientific programs efficiently on such multi-core architectures. We find the results encouraging and believe the approach could yield good results if implemented fully in a production compiler for multi-core architectures.
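To make the setting concrete, the sketch below is a minimal illustration, not the authors' benchmark code: the arrays, sizes, and the specific interchange example are assumptions introduced here. It shows the two cases the abstract refers to: a loop nest whose iterations are independent and can be distributed over threads directly with OpenMP, and a nest where a loop-carried dependence blocks parallelization as written until a loop interchange (one of the unimodular transformations discussed in the references) exposes a dependence-free outer loop.

/* Minimal sketch (assumed example, not the authors' code): loop
 * parallelization with OpenMP, without and with a loop transformation.
 * Array sizes N and M are illustrative. Compile with an OpenMP-capable
 * compiler, e.g.  gcc -fopenmp -O2 sketch.c
 */
#include <stdio.h>
#include <omp.h>

#define N 1024
#define M 1024

static double a[N][M], b[N][M];   /* statics are zero-initialized */

int main(void)
{
    /* Case 1: no cross-iteration dependences, so the outer loop's
     * iterations can be distributed over threads directly. */
    #pragma omp parallel for
    for (int i = 0; i < N; i++)
        for (int j = 0; j < M; j++)
            b[i][j] = (double)(i + j);

    /* Case 2: the dependence a[i][j] <- a[i-1][j] is carried by the
     * i loop, so i cannot run in parallel as written.  Interchanging
     * the loops (legal for this nest) makes the dependence-free j loop
     * outermost, so the whole nest is covered by one parallel loop. */
    #pragma omp parallel for
    for (int j = 0; j < M; j++)
        for (int i = 1; i < N; i++)
            a[i][j] = a[i - 1][j] + b[i][j];

    printf("a[N-1][M-1] = %f\n", a[N - 1][M - 1]);
    return 0;
}

After the interchange, each thread owns a distinct set of columns j and walks the i-carried recurrence sequentially within its columns, which avoids repeatedly forking a parallel region around the inner loop.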
References
AMD Multi-core Products (2006), http://multicore.amd.com/en/products/
Multi-core from Intel Products and Platforms (2006), http://www.intel.com/products/processor/
OpenMP, http://www.openmp.org
Wolfe, M.J.: Techniques for Improving the Inherent Parallelism in Programs. Technical Report 78-929, Department of Computer Science, University of Illinois at Urbana-Champaign (July 1978)
Wolfe, M.: High Performance Compilers for Parallel Computing. Addison-Wesley, Reading (1996)
Banerjee, U.K.: Loop Transformations for Restructuring Compilers: The Foundations. Kluwer Academic Publishers, Norwell (1993)
Banerjee, U.K.: Loop Parallelization. Kluwer Academic Publishers, Norwell (1994)
Pthreads reference, https://computing.llnl.gov/tutorials/pthreads/
D'Hollander, E.H.: Partitioning and Labeling of Loops by Unimodular Transformations. IEEE Transactions on Parallel and Distributed Systems 3(4) (1992)
Sass, R., Mutka, M.: Enabling Unimodular Transformations. In: Supercomputing 1994, pp. 753–762 (November 1994)
Banerjee, U.: Unimodular Transformations of Double Loops. In: Advances in Languages and Compilers for Parallel Processing, pp. 192–219 (1991)
Prakash, S.R., Srikant, Y.N.: An Approach to Global Data Partitioning for Distributed Memory Machines. In: IPPS/SPDP (1999)
Prakash, S.R., Srikant, Y.N.: Communication Cost Estimation and Global Data Partitioning for Distributed Memory Machines. In: Fourth International Conference on High Performance Computing, Bangalore (1997)
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
Cite this paper
Raghavendra, P. et al. (2010). A Study of Performance Scalability by Parallelizing Loop Iterations on Multi-core SMPs. In: Hsu, CH., Yang, L.T., Park, J.H., Yeo, SS. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2010. Lecture Notes in Computer Science, vol 6081. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-13119-6_41
DOI: https://doi.org/10.1007/978-3-642-13119-6_41
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-13118-9
Online ISBN: 978-3-642-13119-6