Hybrid static–dynamic selection of implementation alternatives in heterogeneous environments

D. del Rio Astorga¹,
Manuel F. Dolz¹,
Javier Fernandez¹ &
…
Javier Garcia Blas¹


Abstract

With the emergence of heterogeneous architectures, developing parallel software has become an increasingly complex task. The ability to use multiple devices, such as CPUs, accelerators, or coprocessors, in a single application has turned implementation and optimization into a challenging process that comes with a variety of difficulties. The inherent complexity of the parallel algorithm, its multiple implementations, and the possible mappings onto one of the available processors are just examples of how intricate these tasks can become. To alleviate these issues, this paper proposes a hybrid static–dynamic selector to better exploit the resources provided by heterogeneous systems. Specifically, the framework generates a decision tree at compile time, based on historical information, for selecting the implementation that performs best at run time. To evaluate the benefits of this approach, we analyze performance in two use cases: general matrix–matrix multiplication and a medical image processing application. The experimental results demonstrate that the proposed selector enhances performance and minimizes the effort needed to tune applications. We show that our solution improves overall application performance by 10 to 24% in comparison with other similar approaches.
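The selection idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' framework: the timings in `HISTORY`, the implementation names `"cpu"` and `"gpu"`, and the helper functions `build_selector` and `select` are all hypothetical, and the "tree" is reduced to its simplest form, a sorted list of thresholds over a single feature (the problem size). An offline step derives the thresholds from past measurements, and a cheap lookup picks an implementation when the actual size is known.

```python
from bisect import bisect_right

# Hypothetical historical measurements (invented for illustration):
# (problem size n, implementation name, elapsed seconds)
HISTORY = [
    (64, "cpu", 0.001), (64, "gpu", 0.004),
    (256, "cpu", 0.020), (256, "gpu", 0.009),
    (1024, "cpu", 1.300), (1024, "gpu", 0.150),
]


def build_selector(history):
    """Offline step: derive size thresholds from past timings."""
    best = {}  # size -> (seconds, implementation) of the fastest run
    for size, impl, seconds in history:
        if size not in best or seconds < best[size][0]:
            best[size] = (seconds, impl)

    sizes = sorted(best)
    thresholds = []
    labels = [best[sizes[0]][1]]
    # Place a split halfway between neighbouring sizes whose winner differs.
    for lo, hi in zip(sizes, sizes[1:]):
        if best[hi][1] != labels[-1]:
            thresholds.append((lo + hi) / 2)
            labels.append(best[hi][1])
    return thresholds, labels


def select(selector, size):
    """Run-time step: pick the implementation expected to be fastest."""
    thresholds, labels = selector
    return labels[bisect_right(thresholds, size)]


selector = build_selector(HISTORY)
print(select(selector, 100))  # small problem
print(select(selector, 512))  # large problem
```

With the invented timings above, the offline step produces a single split between sizes 64 and 256, so small problems are routed to the CPU variant and larger ones to the GPU variant. A real selector would likely use more features than the problem size and a proper tree learner.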




Acknowledgements

This work has been partially supported by the EU Project ICT 644235 “RePhrase: REfactoring Parallel Heterogeneous Resource-Aware Applications” and the Project TIN2016-79637-P “Towards Unification of HPC and Big Data Paradigms” from the Spanish “Ministerio de Economía y Competitividad”.

Author information

Authors and Affiliations

Department of Computer Science, Universidad Carlos III, 28911, Leganés, Madrid, Spain
D. del Rio Astorga, Manuel F. Dolz, Javier Fernandez & Javier Garcia Blas


Corresponding author

Correspondence to Manuel F. Dolz.


About this article

Cite this article

del Rio Astorga, D., Dolz, M.F., Fernandez, J. et al. Hybrid static–dynamic selection of implementation alternatives in heterogeneous environments. J Supercomput 75, 4098–4113 (2019). https://doi.org/10.1007/s11227-017-2147-y


Published: 26 September 2017
Issue Date: 01 August 2019
DOI: https://doi.org/10.1007/s11227-017-2147-y
