Abstract
The paper presents a distributed-memory parallelization of the boundary element method on a cluster with many-core compute nodes. We describe a method for efficiently distributing boundary element matrices among MPI processes based on cyclic graph decompositions. In addition, we focus on the intra-node optimization of the code, which is necessary to fully utilize many-core processors with wide SIMD registers. We present numerical experiments carried out on a cluster of Intel Xeon Phi processors of the Knights Landing generation.
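The idea of distributing matrix blocks by a cyclic decomposition can be illustrated with a small sketch. It is not taken from the paper: the function name `cyclic_block_distribution`, the choice of 7 processes, and the generator set {0, 1, 3} (a perfect difference set modulo 7) are assumptions made only for this example, and the decompositions actually used by the authors may differ. The sketch splits the matrix into 7 x 7 blocks, gives process p its own diagonal block, and assigns it the off-diagonal blocks whose row and column indices both lie in the generator set rotated by p.

```python
from itertools import product


def cyclic_block_distribution(n_procs, generator):
    """Assign every block (i, j) of an n_procs x n_procs block matrix to one process.

    `generator` is assumed to be a perfect difference set modulo n_procs,
    i.e. every nonzero residue occurs exactly once as a difference of two
    of its elements. This is an illustrative assumption, not the paper's API.
    """
    owner = {}
    for p in range(n_procs):
        # Rotate the generator vertex set by p (mod n_procs).
        verts = [(g + p) % n_procs for g in generator]
        # Process p owns its own diagonal block ...
        owner[(p, p)] = p
        # ... and every off-diagonal block between its rotated vertices.
        for i, j in product(verts, repeat=2):
            if i == j:
                continue
            if (i, j) in owner:
                raise ValueError("generator is not a perfect difference set")
            owner[(i, j)] = p
    if len(owner) != n_procs * n_procs:
        raise ValueError("generator does not cover all blocks")
    return owner


# Hypothetical example: 7 processes, generator {0, 1, 3} modulo 7.
owner = cyclic_block_distribution(7, (0, 1, 3))
blocks_of_0 = sorted(b for b, p in owner.items() if p == 0)
rows_of_0 = sorted({i for i, _ in blocks_of_0})
```

In this example every process owns 7 of the 49 blocks, yet its blocks touch only 3 of the 7 row and column index groups, so under these assumptions it needs only the corresponding part of the mesh and of the vectors.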
Notes
1. November 2016 version.
Acknowledgements
This work was supported by the Ministry of Education, Youth and Sports from the Large Infrastructures for Research, Experimental Development and Innovations project “IT4Innovations National Supercomputing Center – LM2015070” and from the National Programme of Sustainability (NPU II) project “IT4Innovations excellence in science – LQ1602”. It was also partially supported by SGS grant No. SP2017/165 “Efficient implementation of the boundary element method III”, VŠB – Technical University of Ostrava, Czech Republic. The authors thank HLRN for providing access to the HLRN Berlin Test and Development System.
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Kravcenko, M., Maly, L., Merta, M., Zapletal, J. (2018). Parallel Assembly of ACA BEM Matrices on Xeon Phi Clusters. In: Wyrzykowski, R., Dongarra, J., Deelman, E., Karczewski, K. (eds) Parallel Processing and Applied Mathematics. PPAM 2017. Lecture Notes in Computer Science, vol 10777. Springer, Cham. https://doi.org/10.1007/978-3-319-78024-5_10
DOI: https://doi.org/10.1007/978-3-319-78024-5_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-78023-8
Online ISBN: 978-3-319-78024-5
eBook Packages: Computer Science (R0)