Parallel Tree Algorithms for AMR and Non-Standard Data Access

Published: 07 November 2020

Abstract

We introduce several parallel algorithms operating on a distributed forest of adaptive quadtrees/octrees. They are targeted at large-scale applications relying on data layouts that are more complex than required for standard finite elements, such as hp-adaptive Galerkin methods, particle tracking and semi-Lagrangian schemes, and in-situ post-processing and visualization. Specifically, we design algorithms to derive an adapted worker forest based on sparse data, to identify owner processes in a top-down search of remote objects, and to allow for variable process counts and per-element data sizes in partitioning and parallel file I/O. We demonstrate the algorithms’ usability and performance in the context of a particle tracking example that we scale to 21e9 particles and 64Ki MPI processes on the Juqueen supercomputer, and we describe the construction of a parallel assembly of variably sized spheres in space creating up to 768e9 elements on the Juwels supercomputer.
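The owner-identification step mentioned in the abstract rests on the ordering induced by a space-filling curve: once a point's Morton index is known, the process owning the enclosing leaf can be found by comparing that index against per-process partition markers. The sketch below is a minimal, self-contained illustration of this idea under assumed conventions; the names morton2d, owner_of_point, and part_first are hypothetical, and the code is not the article's implementation nor the p4est library's interface.

```c
/*
 * Minimal sketch (not the article's algorithm or the p4est API):
 * find the owner process of a point in a quadtree partitioned along a
 * Morton-ordered space-filling curve.  x and y are assumed to be integer
 * coordinates of the point at the finest refinement level.
 */
#include <stdint.h>

/* Interleave the bits of x and y to obtain a 2D Morton index. */
uint64_t
morton2d (uint32_t x, uint32_t y)
{
  uint64_t m = 0;
  int i;

  for (i = 0; i < 32; ++i) {
    m |= ((uint64_t) ((x >> i) & 1U)) << (2 * i);
    m |= ((uint64_t) ((y >> i) & 1U)) << (2 * i + 1);
  }
  return m;
}

/*
 * part_first[p] holds the Morton index of the first leaf owned by process p
 * (with part_first[0] == 0).  The owner of the point is the largest p with
 * part_first[p] <= morton2d (x, y), located by binary search.
 */
int
owner_of_point (uint32_t x, uint32_t y,
                const uint64_t *part_first, int num_procs)
{
  uint64_t m = morton2d (x, y);
  int lo = 0, hi = num_procs - 1;

  while (lo < hi) {
    int mid = (lo + hi + 1) / 2;
    if (part_first[mid] <= m) {
      lo = mid;
    } else {
      hi = mid - 1;
    }
  }
  return lo;
}
```

The article's algorithms work top-down through the forest and process many queries collectively; the per-point binary search above only conveys the underlying ordering argument in its simplest form.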




Published In

ACM Transactions on Mathematical Software, Volume 46, Issue 4
December 2020
272 pages
ISSN: 0098-3500
EISSN: 1557-7295
DOI: 10.1145/3430683
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 07 November 2020
Accepted: 01 May 2020
Revised: 01 January 2020
Received: 01 May 2018
Published in TOMS Volume 46, Issue 4

Author Tags

  1. Adaptive mesh refinement
  2. forest of octrees
  3. particle tracking

Qualifiers

  • Research-article
  • Research
  • Refereed

Funding Sources

  • Bonn Hausdorff Center for Mathematics (HCM)
  • Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany’s Excellence Initiative EXC

Cited By

  • (2024) Alternative Quadrant Representations with Morton Index and AVX2 Vectorization for AMR Algorithms within the p4est Software Library. 2024 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), 301-310. DOI: 10.1109/IPDPSW63119.2024.00071. Online publication date: 27-May-2024.
  • (2023) Scalable recovery-based adaptation on Cartesian quadtree meshes for advection-diffusion-reaction problems. Advances in Computational Science and Engineering. DOI: 10.3934/acse.2023018. Online publication date: 2023.
  • (2023) The deal.II Library, Version 9.5. Journal of Numerical Mathematics 31(3), 231-246. DOI: 10.1515/jnma-2023-0089. Online publication date: 22-Aug-2023.
  • (2023) Algorithms for Parallel Generic hp-Adaptive Finite Element Software. ACM Transactions on Mathematical Software 49(3), 1-26. DOI: 10.1145/3603372. Online publication date: 5-Jun-2023.
  • (2023) Parallel simulations for fast-moving landslides: Space-time mesh adaptation and sharp tracking of the wetting front. International Journal for Numerical Methods in Fluids 95(8), 1286-1309. DOI: 10.1002/fld.5186. Online publication date: 23-Mar-2023.
  • (2022) The deal.II library, Version 9.4. Journal of Numerical Mathematics 30(3), 231-246. DOI: 10.1515/jnma-2022-0054. Online publication date: 17-Jul-2022.
  • (2022) Lethe-DEM: an open-source parallel discrete element solver with load balancing. Computational Particle Mechanics 10(1), 77-96. DOI: 10.1007/s40571-022-00478-6. Online publication date: 20-May-2022.
  • (2021) An Optimized, Parallel Computation of the Ghost Layer for Adaptive Hybrid Forest Meshes. SIAM Journal on Scientific Computing 43(6), C359-C385. DOI: 10.1137/20M1383033. Online publication date: 1-Jan-2021.
  • (2021) Distributed Domain Generation for Large-Scale Scientific Computing. 2021 20th International Symposium on Parallel and Distributed Computing (ISPDC), 105-113. DOI: 10.1109/ISPDC52870.2021.9521644. Online publication date: 28-Jul-2021.
