Topology-aware space-shared co-analysis of large-scale molecular dynamics simulations

DOI: 10.1109/SC.2018.00027

Published: 26 July 2019

Abstract

Analysis of scientific simulation data can be executed concurrently with the simulation in either time-shared or space-shared mode. This mitigates the I/O bottleneck; however, it requires either stalling the simulation while the analysis runs (time-shared) or transferring data to dedicated analysis processes (space-shared). In this paper, we improve the throughput of space-shared in situ analysis of large-scale simulations through topology-aware mapping and optimal process decomposition. We propose node-interconnect topology-aware process placement for simulation and analysis to reduce data movement time. We also present an integer linear program for computing optimal 3D decompositions of the simulation and analysis processes. We demonstrate our approach using molecular dynamics simulations on the Mira, Cori, and Theta supercomputers. Our mapping schemes, combined with optimal 3D process decomposition and code optimizations, reduce execution times for space-shared in situ analysis by up to 30% compared to the default approach. Our mappings also reduce MPI collective I/O times by 10--40%.
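
The decomposition contribution can be made concrete with a small example. The paper formulates the choice of 3D decompositions as an integer linear program; the hypothetical Python sketch below is a brute-force stand-in that illustrates the underlying trade-off: it enumerates every 3D factorization of a process count and picks the one minimizing per-subdomain halo surface, a common proxy for nearest-neighbor communication volume in domain-decomposed codes. The function names and the surface-area objective are illustrative assumptions, not the authors' formulation.

# Hypothetical sketch, not the paper's ILP: pick Px x Py x Pz for P processes
# on an Nx x Ny x Nz domain by minimizing each subdomain's halo surface area,
# a standard proxy for nearest-neighbor communication volume.

def decompositions(p):
    """Yield every (px, py, pz) with px * py * pz == p."""
    for px in range(1, p + 1):
        if p % px:
            continue
        q = p // px
        for py in range(1, q + 1):
            if q % py == 0:
                yield px, py, q // py

def halo_surface(nx, ny, nz, px, py, pz):
    """Surface area of one subdomain of the decomposed domain."""
    sx, sy, sz = nx / px, ny / py, nz / pz
    return 2 * (sx * sy + sy * sz + sz * sx)

def best_decomposition(nx, ny, nz, p):
    return min(decompositions(p), key=lambda d: halo_surface(nx, ny, nz, *d))

if __name__ == "__main__":
    # 512 processes on a 1024^3 domain: the cubic split (8, 8, 8) wins.
    print(best_decomposition(1024, 1024, 1024, 512))

The optimization in the paper is richer than this sketch: its ILP chooses decompositions for both the simulation and the analysis processes, and the mapping step then places the two sets of ranks with the node interconnect topology in mind.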




Published In

SC '18: Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis
November 2018
932 pages

In-Cooperation

  • IEEE CS

Publisher

IEEE Press


Author Tags

  1. MD simulation
  2. analysis
  3. mapping
  4. optimization

Qualifiers

  • Research-article

Conference

SC18

Acceptance Rates

Overall Acceptance Rate 1,516 of 6,373 submissions, 24%
