Abstract
As detailed in recent reports, HPC architectures will continue to change over the next decade in an effort to improve energy efficiency, reliability, and performance. At this time of significant disruption, it is critically important to understand specific application requirements, so that these architectural changes can include features that satisfy the requirements of contemporary extreme-scale scientific applications. To address this need, we have developed a methodology supported by a toolkit that allows us to investigate detailed computation, memory, and communication behaviors of applications at varying levels of resolution. Using this methodology, we performed a broad-based, detailed characterization of 12 contemporary scalable scientific applications and benchmarks. Our analysis reveals numerous behaviors that sometimes contradict conventional wisdom about scientific applications. For example, the results reveal that only one of our applications executes more floating-point instructions than other types of instructions. In another example, we found that communication topologies are very regular, even for applications that, at first glance, should be highly irregular. These observations emphasize the necessity of measurement-driven analysis of real applications, and help prioritize features that should be included in future architectures.
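The instruction-mix observation above can be illustrated with a minimal sketch. It is not the toolkit described in the paper: the category names, the counts, and the helper functions `instruction_mix` and `is_fp_dominated` are hypothetical. It also assumes that "more floating-point instructions than other types" means floating-point instructions outnumber all other categories combined.

```python
from typing import Dict


def instruction_mix(counts: Dict[str, int]) -> Dict[str, float]:
    """Return each category's share of the total dynamic instruction count."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no instructions counted")
    return {category: n / total for category, n in counts.items()}


def is_fp_dominated(counts: Dict[str, int], fp_key: str = "fp") -> bool:
    """True if floating-point instructions outnumber all other categories combined.

    This is one possible reading of the abstract's claim (an assumption).
    """
    fp = counts.get(fp_key, 0)
    others = sum(n for key, n in counts.items() if key != fp_key)
    return fp > others


# Hypothetical counts for illustration only; these are not measurements from the paper.
example_counts = {"fp": 300, "memory": 450, "integer": 150, "branch": 100}
mix = instruction_mix(example_counts)
fp_dominated = is_fp_dominated(example_counts)
```

With counts like these, the floating-point share is well under half, which is the kind of profile the abstract reports for all but one of the applications studied.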
Support for this work was provided by the U.S. Department of Energy, Office of Science, Advanced Scientific Computing Research. The work was performed at the Oak Ridge National Laboratory, which is managed by UT-Battelle, LLC under Contract No. DE-AC05-00OR22725 to the U.S. Government. Accordingly, the U.S. Government retains a non-exclusive, royalty-free license to publish or reproduce the published form of this contribution, or allow others to do so, for U.S. Government purposes.
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Vetter, J.S. et al. (2014). Quantifying Architectural Requirements of Contemporary Extreme-Scale Scientific Applications. In: Jarvis, S., Wright, S., Hammond, S. (eds) High Performance Computing Systems. Performance Modeling, Benchmarking and Simulation. PMBS 2013. Lecture Notes in Computer Science, vol 8551. Springer, Cham. https://doi.org/10.1007/978-3-319-10214-6_1
DOI: https://doi.org/10.1007/978-3-319-10214-6_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-10213-9
Online ISBN: 978-3-319-10214-6
eBook Packages: Computer Science (R0)