Research article, DOI: 10.1145/1654059.1654084

Early performance evaluation of a "Nehalem" cluster using scientific and engineering applications

Published: 14 November 2009

Abstract

In this paper, we present an early performance evaluation of a 624-core cluster based on the Intel® Xeon® Processor 5560 (code-named "Nehalem-EP" and referred to as Xeon 5560 in this paper), the third-generation quad-core architecture from Intel. This is the first processor from Intel with a non-uniform memory access (NUMA) architecture managed by an on-chip integrated memory controller. It employs a point-to-point interconnect called the Intel® QuickPath Interconnect (QPI) between processors and to the input/output (I/O) hub. It also introduces to a quad-core architecture both Intel's Hyper-Threading Technology (simultaneous multi-threading, "SMT") and Intel® Turbo Boost Technology ("Turbo mode"), which automatically allows processor cores to run faster than the base operating frequency if the processor is operating below its rated power, temperature, and current specification limits. Turbo mode can be engaged with any number of cores or logical processors enabled and active. We critically evaluate these features using the High Performance Computing Challenge (HPCC) benchmarks, the NAS Parallel Benchmarks (NPB), and four full-scale scientific applications. We compare and contrast the results of a cluster based on the Xeon 5560 with an SGI® Altix® ICE 8200EX cluster of quad-core Intel® Xeon® 5472 Processors ("Xeon 5472" from here on) and another cluster based on the Intel® Xeon® 5462 Processor ("Xeon 5462"). The Xeon 5400 Series Processors are the previous generation of quad-core Intel processors, code-named Harpertown.

Published In

SC '09: Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis
November 2009
778 pages
ISBN: 9781605587448
DOI: 10.1145/1654059

Publisher

Association for Computing Machinery, New York, NY, United States

Acceptance Rates

SC '09 Paper Acceptance Rate: 59 of 261 submissions, 23%
Overall Acceptance Rate: 1,516 of 6,373 submissions, 24%

Cited By

• (2015) Scalable black-box prediction models for multi-dimensional adaptation on NUMA multi-cores. International Journal of Parallel, Emergent and Distributed Systems 30:3 (193-210). DOI: 10.1080/17445760.2014.895346. Online publication date: 1-May-2015
• (2014) A performance comparison of current HPC systems. Future Generation Computer Systems 30:C (291-304). DOI: 10.5555/2747903.2748195. Online publication date: 1-Jan-2014
• (2014) A performance comparison of current HPC systems: Blue Gene/Q, Cray XE6 and InfiniBand systems. Future Generation Computer Systems 30 (291-304). DOI: 10.1016/j.future.2013.06.019. Online publication date: Jan-2014
• (2014) Performance Evaluation of the Intel Sandy Bridge Based NASA Pleiades Using Scientific and Engineering Applications. High Performance Computing Systems. Performance Modeling, Benchmarking and Simulation (25-51). DOI: 10.1007/978-3-319-10214-6_2. Online publication date: 1-Oct-2014
• (2013) Unified performance and power modeling of scientific workloads. Proceedings of the 1st International Workshop on Energy Efficient Supercomputing (1-8). DOI: 10.1145/2536430.2536435. Online publication date: 17-Nov-2013
• (2013) An early performance evaluation of many integrated core architecture based SGI rackable computing system. Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis (1-12). DOI: 10.1145/2503210.2503272. Online publication date: 17-Nov-2013
• (2013) Tracking the Performance Evolution of Blue Gene Systems. Supercomputing (317-329). DOI: 10.1007/978-3-642-38750-0_24. Online publication date: 2013
• (2012) Comparing the Performance of Blue Gene/Q with Leading Cray XE6 and InfiniBand Systems. Proceedings of the 2012 IEEE 18th International Conference on Parallel and Distributed Systems (556-563). DOI: 10.1109/ICPADS.2012.81. Online publication date: 17-Dec-2012
• (2012) Performance evaluation of hybrid programming patterns for large CPU/GPU heterogeneous clusters. Computer Physics Communications 183:6 (1172-1181). DOI: 10.1016/j.cpc.2012.01.019. Online publication date: Jun-2012
• (2011) An early performance analysis of POWER7-IH HPC systems. Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis (1-11). DOI: 10.1145/2063384.2063440. Online publication date: 12-Nov-2011
