article

Effective out-of-core parallel delaunay mesh refinement using off-the-shelf software

Authors:

Andrey N. Chernikov,

Nikos P. ChrisochoidesAuthors Info & Claims

Journal of Experimental Algorithmics (JEA), Volume 16

Article No.: 1.5, Pages 1.1 - 1.22

https://doi.org/10.1145/1963190.2019580

Published: 26 August 2011 Publication History

Abstract

We present three related out-of-core parallel mesh generation algorithms and their implementations for small size computational clusters. Computing out-of-core permits to solve larger problems than otherwise possible on the same hardware setup. Also, when using shared computing resources with high demand, a problem can take longer to compute in terms of wall-clock time when using an in-core algorithm on many nodes instead of using an out-of-core algorithm on few nodes. The difference is due to wait-in-queue delays that can grow exponentially to the number of requested nodes. In one specific case, using our best method and only 16 nodes it can take several times less wall-clock time to generate a 2 billion element mesh than to generate the same size mesh in-core with 121 nodes.

Although our best out-of-core method exhibits unavoidable overheads (could be as low as 19% in some cases) over the corresponding in-core method (for mesh sizes that fit completely in-core), this is a modest and expected performance penalty. We evaluated our methods on traditional clusters of workstations as well as presented preliminary performance evaluation on emerging BlueWaters supercomputer.

References

[1]

Aggarwal, A. and Vitter, Jeffrey, S. 1988. The input/output complexity of sorting and related problems. Comm. ACM 31, 9, 1116--1127.

Digital Library

[2]

Barker, K., Chernikov, A., Chrisochoides, N., and Pingali, K. 2004. A load balancing framework for adaptive and asynchronous applications. IEEE Trans. Parall. Distrib. Syst. 15, 2, 183--192.

Digital Library

[3]

Blelloch, G. E., Hardwick, J., Miller, G. L., and Talmor, D. 1999. Design and implementation of a practical parallel Delaunay algorithm. Algorithmica 24, 243--269.

[4]

Chernikov, A. and Chrisochoides, N. 2006a. Generalized delaunay mesh refinement: From scalar to parallel. In Proceedings of the 15th International Meshing Roundtable. Springer, Berlin, 563--580.

[5]

Chernikov, A. and Chrisochoides, N. 2006b. Parallel guaranteed quality delaunay uniform mesh refinement. SIAM J. Sci. Comput. 28, 1907--1926.

Digital Library

[6]

Chernikov, A. and Chrisochoides, N. 2008a. Algorithm 872: Parallel 2d constrained delaunay mesh generation. ACM Trans. Math. Softw. 34, 6--25.

Digital Library

[7]

Chernikov, A. and Chrisochoides, N. 2008b. Three-dimensional delaunay refinement for multi-core processors. In Proceedings of the 22nd ACM International Conference on Supercomputing. ACM, New York, 214--224.

Digital Library

[8]

Chernikov, A. N. and Chrisochoides, N. P. 2004. Practical and efficient point insertion scheduling method for parallel guaranteed quality Delaunay refinement. In Proceedings of the 18th International Conference on Supercomputing. ACM, New York, 48--57.

Digital Library

[9]

Chernikov, A. N. and Chrisochoides, N. P. 2009. Generalized two-dimensional Delaunay mesh refinement. SIAM J. Sci. Comput. 31, 5, 3387--3403.

Digital Library

[10]

Choi, J., Dongarra, J., Pozo, R., and Walker, D. 1992. Scalapack: A scalable linear algebra for distributed memory concurrent computers. In Proceedings of the 4th Symposium on the Frontiers of Massively Parallel Computation. IEEE, Los Alamitos, CA, 120--127.

[11]

Chrisochoides, N. 2005. Parallel mesh generation. In Numerical Solution of Partial Differential Equations on Parallel Computers 51, Eds. M. Bruaset, P. Bjorstad, A. Tveito.

[12]

D'Azevedo, E. F. and Dongarra, J. 2000. The design and implementation of the parallel out-of-core ScaLAPACK LU, QR, and Cholesky factorization routines. Concurrency Pract. Exp. 12, 15, 1481--1493.

[13]

Dehne, F., Dittrich, W., and Hutchinson, D. 1997. Efficient external memory algorithms by simulating coarse-grained parallel algorithms. In Proceedings of the 9th ACM Symposium on Parallel Algorithms and Architectures. ACM, New York, 106--115.

Digital Library

[14]

Demmel, J., Dongarra, J., Croz, J. D., Greenbaum, A., Hammarling, S., and Sorensen, D. 1987. Prospectus for the development of a linear algebra library for high-performance computers. Tech. rep. ANL/MCS-TM-97. Argonne, IL.

[15]

Dong, S., Lucor, D., and Karniadakis, G. E. 2004. Flow past a stationary and moving cylinder: DNS at Re = 10,000. In Proceedings of the Users Group Conference (DOD_UGC'04). IEEE, Los Alamitos, CA, 88--95.

Digital Library

[16]

Dongarra, J. J., Du Croz, J., Hammarling, S., and Hanson, R. J. 1988. An extended set of FORTRAN Basic Linear Algebra Subprograms. ACM Trans. Math. Softw. 14, 1, 1--17.

Digital Library

[17]

Foteinos, P. A., Chernikov, A. N., and Chrisochoides, N. P. 2010. Fully generalized two-dimensional constrained Delaunay mesh refinement. SIAM J. Sci. Comput. 32, 5, 2659--2686.

Digital Library

[18]

Karniadakis, G. and Orszag, S. 1993. Nodes, modes, and flow codes. Physics Today 46, 34--42.

[19]

Kot, A., Chernikov, A., and Chrisochoides, N. 2005. Out-of-core parallel delaunay mesh generation for shared memory machines. In Proceedings of the 17th IMACS World Congress Scientific Computation, Applied Mathematics and Simulation.

[20]

Kot, A., Chernikov, A., and Chrisochoides, N. 2006. Effective out-of-core parallel delaunay mesh refinement using off-the-shelf software. In Proceedings of the 20th IEEE International Parallel and Distributed Processing Symposium. IEEE, Los Alamitos, CA, 104--114.

Digital Library

[21]

Linardakis, L. and Chrisochoides, N. 2008a. Algorithm 870: A static geometric medial axis domain decomposition in 2d euclidean space. ACM Trans. Math. Softw. 34, 4:1--4:28.

Digital Library

[22]

Linardakis, L. and Chrisochoides, N. 2008b. Graded delaunay decoupling method for parallel guaranteed quality planar mesh generation. SIAM J. Sci. Comput. 30, 1875--1891.

Digital Library

[23]

Nave, D., Chrisochoides, N., and Chew, L. P. 2002. Guaranteed: quality parallel Delaunay refinement for restricted polyhedral domains. In Proceedings of the 18th Annual Symposium on Computational Geometry (SCG'02). ACM, New York, 135--144.

Digital Library

[24]

Nodine, M. H. and Vitter, J. S. 1995. Greed sort: optimal deterministic sorting on parallel disks. J. ACM 42, 4, 919--933.

Digital Library

[25]

Ruppert, J. 1995. A Delaunay refinement algorithm for quality 2-dimensional mesh generation. J. Algorithms 18, 3, 548--585.

Digital Library

[26]

Salmon, J. and Warren, M. 1997. Parallel out-of-core methods for N-body simulation. In Proceedings of the 8th SIAM Conference on Parallel Processing for Scientific Computing. SIAM, Philadelphia, PA.

[27]

Shewchuk, J. R. 1996. Triangle: Engineering a 2D quality mesh generator and delaunay triangulator. In Applied Computational Geometry: Towards Geometric Engineering, M. C. Lin and D. Manocha, Eds., Lecture Notes in Computer Science Series, vol. 1148. Springer-Verlag, Berlin, 203--222.

Digital Library

[28]

Shewchuk, J. R. 1997. Delaunay refinement mesh generation. Ph.D. thesis, Carnegie Mellon University.

Digital Library

[29]

Shewchuk, J. R. 2002. Delaunay refinement algorithms for triangular mesh generation. Comput. Geom. Theory Appl. 22, 1--3, 21--74.

Digital Library

[30]

Toledo, S. and Gustavson, F. 1996. The design and implementation of solar, a portable library for scalable out-of-core linear algebra computations. In Proceedings of the 4th Annual Workshop on I/O in Parallel and Distributed Systems. ACM, New York, 28--40.

Digital Library

[31]

Tu, T. and O'Hallaron, D. R. 2004. A computational database system for generating unstructured hexahedral meshes with billions of elements. In Proceedings of the ACM/IEEE Conference on Supercomputing (SC'04). IEEE, Los Alamitos, CA, 25--39.

Digital Library

[32]

Tu, T. and O'Hallaron, D. R. 2005. Extracting hexahedral mesh structures from balanced linear octrees. In Proceedings of the 13th International Meshing Roundtable. Springer, Berlin, 191--200.

[33]

Vitter, J. S. and Nodine, M. H. 1993. Large-scale sorting in uniform memory hierarchies. J. Parall. Distrib. Comput. 17, 107--114.

Digital Library

[34]

Vitter, J. S. and Shriver, E. A. M. 1993. Algorithms for parallel memory ii: Hierarchical multilevel memories. Algorithmica 12, 148--169.

Digital Library

[35]

Vitter, J. S., Shriver, E. A. M., and Z, E. A. M. S. 1994. Algorithms for parallel memory i: Two-level memories. Algorithmica 12, 110--147.

Digital Library

[36]

Walters, R. A. 2005. Coastal Ocean Models: Two useful finite element methods. Cont. Shelf Res. 25, 775--793.

Index Terms

Effective out-of-core parallel delaunay mesh refinement using off-the-shelf software
1. Computing methodologies
  1. Symbolic and algebraic manipulation
    1. Symbolic and algebraic algorithms
      1. Nonalgebraic algorithms
2. Information systems
  1. Information retrieval
    1. Evaluation of retrieval results

Recommendations

Effective out-of-core parallel delaunay mesh refinement using off-the-shelf software
IPDPS'06: Proceedings of the 20th international conference on Parallel and distributed processing

We present two cost-effective and high-performance out-of-core parallel mesh generation algorithms and their implementation on Cluster of Workstations (CoWs). The total wall-clock time including wait-in-queue delays for the out-of-core methods on a ...
TetGen, a Delaunay-Based Quality Tetrahedral Mesh Generator

TetGen is a C++ program for generating good quality tetrahedral meshes aimed to support numerical methods and scientific computing. The problem of quality tetrahedral mesh generation is challenged by many theoretical and practical issues. TetGen uses ...
Parallel delaunay refinement mesh generation

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Journal of Experimental Algorithmics

ACM Journal of Experimental Algorithmics Volume 16, Issue

2011

411 pages

ISSN:1084-6654

EISSN:1084-6654

DOI:10.1145/1963190

Issue’s Table of Contents

Copyright © 2011 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 26 August 2011

Published in JEA Volume 16

Author Tags

Qualifiers

Article

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

0
Total Citations
216
Total Downloads

Downloads (Last 12 months)3
Downloads (Last 6 weeks)0

Reflects downloads up to 12 Nov 2024

Other Metrics

View Author Metrics

Citations

View Options

Get Access

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Article

View options

PDF

View or Download as a PDF file.

eReader

View online with eReader.

Media

Figures

Other

Tables

View Issue’s Table of Contents