Almási et al., 2004 - Google Patents

Implementing MPI on the BlueGene/L supercomputer

Document ID
1469916293074940355
Author
Almási G
Archer C
Castanos J
Erway C
Heidelberger P
Martorell X
Moreira J
Pinnow K
Ratterman J
Smeds N
Steinmacher-Burow B
Gropp W
Toonen B
Publication year
2004
Publication venue
Euro-Par 2004 Parallel Processing: 10th International Euro-Par Conference, Pisa, Italy, August 31-September 3, 2004. Proceedings 10

Snippet

The BlueGene/L supercomputer will consist of 65,536 dual-processor compute nodes interconnected by two high-speed networks: a three-dimensional torus network and a tree topology network. Each compute node can only address its own local memory, making …

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F15/00 Digital computers in general; Data processing equipment in general
    • G06F15/16 Combinations of two or more digital computers each having at least an arithmetic unit, a programme unit and a register, e.g. for a simultaneous processing of several programmes
    • G06F15/163 Interprocessor communication
    • G06F15/173 Interprocessor communication using an interconnection network, e.g. matrix, shuffle, pyramid, star, snowflake
    • G06F15/17356 Indirect interconnection networks
    • G06F15/17368 Indirect interconnection networks non hierarchical topologies
    • G06F15/17381 Two dimensional, e.g. mesh, torus
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F15/00 Digital computers in general; Data processing equipment in general
    • G06F15/16 Combinations of two or more digital computers each having at least an arithmetic unit, a programme unit and a register, e.g. for a simultaneous processing of several programmes
    • G06F15/163 Interprocessor communication
    • G06F15/173 Interprocessor communication using an interconnection network, e.g. matrix, shuffle, pyramid, star, snowflake
    • G06F15/17337 Direct connection machines, e.g. completely connected computers, point to point communication networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0806 Multiuser, multiprocessor or multiprocessing cache systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/0223 User address space allocation, e.g. contiguous or non contiguous base addressing
    • G06F12/023 Free address space management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F15/00 Digital computers in general; Data processing equipment in general
    • G06F15/76 Architectures of general purpose stored programme computers
    • G06F15/78 Architectures of general purpose stored programme computers comprising a single central processing unit
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for programme control, e.g. control unit
    • G06F9/06 Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
    • G06F9/46 Multiprogramming arrangements
    • G06F9/54 Interprogramme communication; Intertask communication
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for programme control, e.g. control unit
    • G06F9/06 Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F13/00 Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/14 Handling requests for interconnection or transfer
    • G06F13/16 Handling requests for interconnection or transfer for access to memory bus
    • G06F13/1605 Handling requests for interconnection or transfer for access to memory bus based on arbitration
    • G06F13/1642 Handling requests for interconnection or transfer for access to memory bus based on arbitration with request queuing

Similar Documents

Publication Publication Date Title
US10887238B2 (en) High performance, scalable multi chip interconnect
Almási et al. Design and implementation of message-passing services for the Blue Gene/L supercomputer
Derradji et al. The BXI interconnect architecture
US8032892B2 (en) Message passing with a limited number of DMA byte counters
Petrini et al. The Quadrics network: High-performance clustering technology
Petrini et al. Performance evaluation of the Quadrics interconnection network
US7788334B2 (en) Multiple node remote messaging
US7886084B2 (en) Optimized collectives using a DMA on a parallel computer
Almási et al. Implementing MPI on the BlueGene/L supercomputer
EP1615138A2 (en) Multiprocessor chip having bidirectional ring interconnect
US8756270B2 (en) Collective acceleration unit tree structure
TWI547870B (en) Method and system for ordering I/O access in a multi-node environment
US20090006296A1 (en) DMA engine for repeating communication patterns
Tipparaju et al. Host-assisted zero-copy remote memory access communication on InfiniBand
Papadopoulou et al. A performance study of UCX over InfiniBand
Muthukrishnan et al. Finepack: Transparently improving the efficiency of fine-grained transfers in multi-GPU systems
US11552907B2 (en) Efficient packet queueing for computer networks
Sack et al. Collective algorithms for multiported torus networks
Gao et al. Impact of reconfigurable hardware on accelerating MPI_Reduce
Suresh et al. Network assisted non-contiguous transfers for GPU-aware MPI libraries
Afsahi et al. Efficient communication using message prediction for cluster of multiprocessors
Thorson et al. SGI® UV2: A fused computation and data analysis machine
Nüssle et al. Accelerate communication, not computation!
Dhanraj Enhancement of LiMIC-Based Collectives for Multi-core Clusters
Almási et al. Architecture and performance of the BlueGene/L message layer