Parallel Assembly of ACA BEM Matrices on Xeon Phi Clusters

Michal Kravcenko, Lukas Maly, Michal Merta & Jan Zapletal

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10777)

Included in the following conference series:

International Conference on Parallel Processing and Applied Mathematics

Abstract

The paper presents a parallelization of the boundary element method for distributed-memory clusters equipped with many-core compute nodes. A method for efficiently distributing boundary element matrices among MPI processes, based on cyclic graph decompositions, is described. In addition, we focus on intra-node optimization of the code, which is necessary to fully utilize many-core processors with wide SIMD registers. Numerical experiments carried out on a cluster of Intel Xeon Phi processors of the Knights Landing generation are presented.
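
As an illustration of the idea of distributing matrix blocks by a cyclic graph decomposition, the following minimal sketch assigns the blocks of a 7 x 7 block matrix to 7 processes. It is not the authors' implementation. The function name cyclic_block_assignment and the generator (0, 1, 3) are assumptions chosen for this example; (0, 1, 3) is a perfect difference set modulo 7, so every off-diagonal block is owned by exactly one process, and each process touches only three row/column index blocks.

```python
from itertools import product


def cyclic_block_assignment(num_procs, generator):
    """Assign every block (i, j) of a num_procs x num_procs block matrix
    to exactly one owner rank.

    Hypothetical helper for illustration only. `generator` is assumed to be
    a set of residues whose pairwise differences cover each nonzero residue
    modulo num_procs exactly once (a perfect difference set).
    """
    owner = {}
    for rank in range(num_procs):
        # Rotate the generator by the rank to get this rank's index blocks.
        vertices = [(rank + d) % num_procs for d in generator]
        # One diagonal block per rank, plus all off-diagonal blocks
        # between the rank's index blocks.
        blocks = [(rank, rank)]
        blocks += [(i, j) for i, j in product(vertices, repeat=2) if i != j]
        for block in blocks:
            if block in owner:
                raise ValueError(f"block {block} assigned twice")
            owner[block] = rank
    if len(owner) != num_procs ** 2:
        raise ValueError("generator does not cover all blocks")
    return owner


# Example: 7 processes, generator {0, 1, 3} (assumed for this sketch).
owner = cyclic_block_assignment(7, (0, 1, 3))
blocks_of_rank_0 = sorted(b for b, r in owner.items() if r == 0)
print(blocks_of_rank_0)
```

With this assignment each process owns 7 of the 49 blocks but only needs the parts of the input and output vectors belonging to 3 of the 7 index blocks, which is the kind of saving such decompositions aim for.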

Notes

1. November 2016 version.

Acknowledgements

This work was supported by the Ministry of Education, Youth and Sports from the Large Infrastructures for Research, Experimental Development and Innovations project “IT4Innovations National Supercomputing Center – LM2015070” and from the National Programme of Sustainability (NPU II) project “IT4Innovations excellence in science – LQ1602”. It was partially supported by SGS grant No. SP2017/165 “Efficient implementation of the boundary element method III”, VŠB – Technical University of Ostrava, Czech Republic. The authors thank HLRN for providing access to the HLRN Berlin Test and Development System.

Author information

Authors and Affiliations

IT4Innovations, VŠB – Technical University of Ostrava, 17. listopadu 15/2172, 708 33, Ostrava-Poruba, Czech Republic
Michal Kravcenko, Lukas Maly, Michal Merta & Jan Zapletal
Department of Applied Mathematics, VŠB – Technical University of Ostrava, 17. listopadu 15/2172, 708 33, Ostrava-Poruba, Czech Republic
Michal Kravcenko, Lukas Maly, Michal Merta & Jan Zapletal

Corresponding author

Correspondence to Michal Merta.

Editor information

Editors and Affiliations

Czestochowa University of Technology, Czestochowa, Poland
Roman Wyrzykowski
University of Tennessee, Knoxville, Tennessee, USA
Jack Dongarra
University of Southern California, Marina Del Rey, California, USA
Ewa Deelman
Czestochowa University of Technology, Czestochowa, Poland
Konrad Karczewski

About this paper

Cite this paper

Kravcenko, M., Maly, L., Merta, M., Zapletal, J. (2018). Parallel Assembly of ACA BEM Matrices on Xeon Phi Clusters. In: Wyrzykowski, R., Dongarra, J., Deelman, E., Karczewski, K. (eds) Parallel Processing and Applied Mathematics. PPAM 2017. Lecture Notes in Computer Science, vol 10777. Springer, Cham. https://doi.org/10.1007/978-3-319-78024-5_10

DOI: https://doi.org/10.1007/978-3-319-78024-5_10
Published: 23 March 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-78023-8
Online ISBN: 978-3-319-78024-5
eBook Packages: Computer Science, Computer Science (R0)
