Kota et al., 2005 - Google Patents

Horus: Large-scale symmetric multiprocessing for Opteron systems

Document ID: 12817867863997044935
Author: Kota R; Oehler R
Publication year: 2005
Publication venue: IEEE Micro

Snippet

Horus lets server vendors design up to 32-way Opteron systems. Horus is the only chip that targets the Opteron in an SMP implementation. By implementing a local directory structure to filter unnecessary probes and by offering 64 Mbytes of remote data cache, the chip …

Classifications

- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0806—Multiuser, multiprocessor or multiprocessing cache systems
- G06F12/0815—Cache consistency protocols
- G06F12/0817—Cache consistency protocols using directory methods
- G06F12/0826—Limited pointers directories; State-only directories without pointers
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0806—Multiuser, multiprocessor or multiprocessing cache systems
- G06F12/0815—Cache consistency protocols
- G06F12/0831—Cache consistency protocols using a bus scheme, e.g. with bus monitoring or watching means
- G06F12/0835—Cache consistency protocols using a bus scheme, e.g. with bus monitoring or watching means for main memory peripheral accesses (e.g. I/O or DMA)
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/16—Combinations of two or more digital computers each having at least an arithmetic unit, a programme unit and a register, e.g. for a simultaneous processing of several programmes
- G06F15/163—Interprocessor communication
- G06F15/173—Interprocessor communication using an interconnection network, e.g. matrix, shuffle, pyramid, star, snowflake
- G06F15/17356—Indirect interconnection networks
- G06F15/17368—Indirect interconnection networks non hierarchical topologies
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/30—Arrangements for executing machine-instructions, e.g. instruction decode
- G06F9/30003—Arrangements for executing specific machine instructions
- G06F9/30076—Arrangements for executing specific machine instructions to perform miscellaneous control operations, e.g. NOP
- G06F9/30087—Synchronisation or serialisation instructions
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Error detection; Error correction; Monitoring responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/0703—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
- G06F11/0706—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment
- G06F11/0721—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment within a central processing unit [CPU]
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/76—Architectures of general purpose stored programme computers
- G06F15/78—Architectures of general purpose stored programme computers comprising a single central processing unit
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F13/00—Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
- G06F13/38—Information transfer, e.g. on bus
- G06F13/40—Bus structure
- G06F13/4004—Coupling between buses
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F2212/00—Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
- G06F2212/25—Using a specific main memory architecture
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F2212/00—Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
- G06F2212/10—Providing a specific technical effect
- G06F2212/1012—Design facilitation

Similar Documents

Publication	Publication Date	Title
Charlesworth	1998	Starfire: extending the SMP envelope
US7577727B2 (en)	2009-08-18	Dynamic multiple cluster system reconfiguration
US10394747B1 (en)	2019-08-27	Implementing hierarchical PCI express switch topology over coherent mesh interconnect
Jerger et al.	2008	Circuit-switched coherence
US7613882B1 (en)	2009-11-03	Fast invalidation for cache coherency in distributed shared memory system
Abts et al.	2007	The Cray BlackWidow: a highly scalable vector multiprocessor
JP3980488B2 (en)	2007-09-26	Massively parallel computer system
JP2006120147A (en)	2006-05-11	Method and device for supporting multiple configuration by multiprocessor system
US10528519B2 (en)	2020-01-07	Computing in parallel processing environments
Ebrahimi et al.	2011	Agent-based on-chip network using efficient selection method
CN113448913A (en)	2021-09-28	System, apparatus and method for performing remote atomic operations via an interface
Kota et al.	2005	Horus: Large-scale symmetric multiprocessing for Opteron systems
Gao et al.	2010	System architecture of Godson-3 multi-core processors
Laudon et al.	1997	System overview of the SGI Origin 200/2000 product line
Sharma	2023	Novel composable and scaleout architectures using compute express link
US7320048B2 (en)	2008-01-15	Apparatus and method to switch a FIFO between strobe sources
US7337279B2 (en)	2008-02-26	Methods and apparatus for sending targeted probes
Ros et al.	2010	EMC2: Extending Magny-Cours coherence for large-scale servers
Walters et al.	2015	The IBM z13 processor cache subsystem
Xu et al.	2011	Explorations of optimal core and cache placements for chip multiprocessor
BanaiyanMofrad et al.	2012	A novel NoC-based design for fault-tolerance of last-level caches in CMPs
Iyer et al.	2000	Design and evaluation of a switch cache architecture for CC-NUMA multiprocessors
Manian et al.	2019	OMB-UM: Design, implementation, and evaluation of CUDA unified memory aware MPI benchmarks
Lodde et al.	2012	Heterogeneous network design for effective support of invalidation-based coherency protocols
Dai	2021	Reverse Engineering the Intel Cascade Lake Mesh Interconnect