Abstract
Parallelism is one of the main sources of performance improvement in modern computing environments, but efficient exploitation of the available parallelism depends on a number of parameters. Determining the optimum number of threads for a given data-parallel loop, for example, is a difficult problem that depends on the specific parallel platform. This paper presents a learning-based, cost-aware approach to parallel workload allocation. The approach uses static program features to classify programs and then decides the best workload allocation scheme based on its prior experience with similar programs. Experimental results on 12 Java benchmarks (76 test cases with different workloads in total) show that the approach efficiently allocates the parallel workload among Java threads, achieving an efficiency of 86% on average.
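To make the idea concrete, the sketch below shows one way such instance-based prediction could look: a data-parallel loop is summarised by a few static features, and the thread count that worked best for the nearest previously seen program is reused. This is a minimal illustration, not the authors' implementation; the feature set (iteration count, body size, memory operations per iteration), the Euclidean distance metric, and the training values are assumptions made for the example.

```java
import java.util.*;

/**
 * Minimal sketch of learning-based workload allocation: a loop is described
 * by static features, and the allocation that worked best for the nearest
 * previously seen program is reused. Features, metric, and data are
 * illustrative assumptions, not the paper's actual model.
 */
public class AllocationPredictor {

    /** Static features of a data-parallel loop (hypothetical selection). */
    record LoopFeatures(double iterationCount, double bodySizeInsns, double memOpsPerIter) {}

    /** A previously observed program and its best workload allocation. */
    record TrainingCase(LoopFeatures features, int bestThreadCount) {}

    private final List<TrainingCase> experience = new ArrayList<>();

    void record(LoopFeatures f, int bestThreads) {
        experience.add(new TrainingCase(f, bestThreads));
    }

    /** 1-nearest-neighbour lookup in feature space (Euclidean distance). */
    int predictThreads(LoopFeatures query) {
        TrainingCase best = null;
        double bestDist = Double.MAX_VALUE;
        for (TrainingCase c : experience) {
            double d = distance(c.features(), query);
            if (d < bestDist) { bestDist = d; best = c; }
        }
        // Fall back to a single thread when no prior experience exists.
        return best == null ? 1 : best.bestThreadCount();
    }

    private static double distance(LoopFeatures a, LoopFeatures b) {
        double di = a.iterationCount() - b.iterationCount();
        double ds = a.bodySizeInsns() - b.bodySizeInsns();
        double dm = a.memOpsPerIter() - b.memOpsPerIter();
        return Math.sqrt(di * di + ds * ds + dm * dm);
    }

    public static void main(String[] args) {
        AllocationPredictor p = new AllocationPredictor();
        // Illustrative training cases; feature values and thread counts are made up.
        p.record(new LoopFeatures(1_000_000, 40, 8), 8);
        p.record(new LoopFeatures(500, 12, 2), 1);
        p.record(new LoopFeatures(100_000, 25, 5), 4);

        LoopFeatures unseen = new LoopFeatures(80_000, 30, 6);
        System.out.println("Predicted thread count: " + p.predictThreads(unseen));
    }
}
```

A real system of this kind would also weigh the runtime cost of thread creation and synchronisation against the expected gain, which is what makes the allocation cost-aware.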
Copyright information
© 2007 IFIP International Federation for Information Processing
About this paper
Cite this paper
Long, S., Fursin, G., Franke, B. (2007). A Cost-Aware Parallel Workload Allocation Approach Based on Machine Learning Techniques. In: Li, K., Jesshope, C., Jin, H., Gaudiot, JL. (eds) Network and Parallel Computing. NPC 2007. Lecture Notes in Computer Science, vol 4672. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74784-0_51
DOI: https://doi.org/10.1007/978-3-540-74784-0_51
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-74783-3
Online ISBN: 978-3-540-74784-0