Module 4
Chapter 7
Multiprocessors and Multicomputers
Book: “Advanced Computer Architecture – Parallelism, Scalability, Programmability”, Hwang & Jotwani
MULTIPROCESSOR SYSTEM INTERCONNECTS
• Network Characteristics
o Topology
• Dynamic Networks
o Timing control protocol
• Synchronous (with global clock)
• Asynchronous (with handshake or interlocking mechanism)
o Switching method
• Circuit switching
• Packet switching
o Control Strategy
• Centralized (global controller to receive requests from all devices and grant network access)
• Distributed (requests handled by local devices independently)
CACHE COHERENCE MECHANISMS
• Protocol Approaches
o Snoopy Bus Protocol
o Directory Based Protocol
• Write Policies
o (Write-back, Write-through) × (Write-invalidate, Write-update)
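To make one cell of the write-policy grid concrete, here is a minimal sketch of a snoopy bus with write-through caches, where the snoop action is either write-invalidate or write-update. The class name `SnoopyBus`, the dictionary-based caches, and the default value of 0 for unwritten memory are illustrative assumptions, not a model of any machine described in the book. Write-back and directory-based schemes are not covered.

```python
class SnoopyBus:
    """Toy shared bus: write-through caches with a selectable snoop action.

    Hypothetical model for illustration only; real protocols track
    per-line states and bus transactions in far more detail.
    """

    def __init__(self, num_caches, policy="invalidate"):
        if policy not in ("invalidate", "update"):
            raise ValueError("policy must be 'invalidate' or 'update'")
        self.policy = policy
        self.memory = {}                                    # address -> value
        self.caches = [dict() for _ in range(num_caches)]   # one cache per processor

    def read(self, cpu, addr):
        cache = self.caches[cpu]
        if addr not in cache:
            # Read miss: fetch the block from main memory (0 if never written).
            cache[addr] = self.memory.get(addr, 0)
        return cache[addr]

    def write(self, cpu, addr, value):
        self.caches[cpu][addr] = value   # update the writer's own copy
        self.memory[addr] = value        # write-through: memory is updated immediately
        # Every other cache snoops the bus and reacts to the write.
        for other, cache in enumerate(self.caches):
            if other == cpu or addr not in cache:
                continue
            if self.policy == "invalidate":
                del cache[addr]          # write-invalidate: drop the stale copy
            else:
                cache[addr] = value      # write-update: refresh the remote copy
```

With the invalidate policy, a remote copy disappears after a write and the next read misses and refetches from memory. With the update policy, the remote copy stays valid and holds the new value.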
MESSAGE PASSING SCHEMES
• Store-and-forward routing
• Wormhole routing
• Asynchronous Pipelining
• Latency Analysis
o L: Packet length (in bits)
o W: Channel Bandwidth (in bits per second)
o D: Distance (number of nodes traversed minus 1)
o F: Flit length (in bits)
o Communication Latency in Store-and-forward Routing
• $T_{SF} = \frac{L}{W}(D + 1)$
o Communication Latency in Wormhole Routing
• $T_{WH} = \frac{L}{W} + \frac{F}{W} D$
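The two latency expressions can be compared directly with a short calculation. This sketch uses the symbols L, W, D and F as defined in the list above. The function names and the sample values (a 1024-bit packet, a 1024 bit/s channel, 32-bit flits, distance 3) are assumptions chosen only to keep the arithmetic easy to follow.

```python
def store_and_forward_latency(L, W, D):
    """T_SF = (L / W) * (D + 1): the whole packet is retransmitted at every hop."""
    return (L / W) * (D + 1)


def wormhole_latency(L, W, D, F):
    """T_WH = L / W + (F / W) * D: only one flit time is added per hop."""
    return L / W + (F / W) * D


# Hypothetical sample values, not taken from the source.
packet_bits = 1024
channel_bps = 1024
flit_bits = 32
distance = 3

t_sf = store_and_forward_latency(packet_bits, channel_bps, distance)
t_wh = wormhole_latency(packet_bits, channel_bps, distance, flit_bits)
print(f"store-and-forward: {t_sf} s, wormhole: {t_wh} s")
```

For these values the store-and-forward latency is 4.0 s and the wormhole latency is 1.09375 s. When L is much larger than F, the second term of the wormhole expression is small, so the distance D has little effect on wormhole latency, while store-and-forward latency grows in proportion to D + 1.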
Advanced Computer Architecture
Chapter 8
Multivector and SIMD Computers
VECTOR PROCESSING PRINCIPLES
• Vector-Vector Instructions
o F1: Vi → Vj
o F2: Vi × Vj → Vk
o Examples: V1 = sin(V2); V3 = V1 + V2
• Vector-Scalar Instructions
o F3: s × Vi → Vj
o Example: V2 = 6 + V1
• Vector-Memory Instructions
o F4: M → V (Vector Load)
o F5: V → M (Vector Store)
o Examples: X = V1; V2 = Y
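The five instruction types F1 to F5 can be mimicked in software to show what each mapping means. This is a sketch only: vector registers are modeled as Python lists, memory as a flat list, and every function name here is hypothetical. A real vector unit performs these operations in pipelined hardware, not element by element in a loop.

```python
import math
import operator


def f1_vector_unary(op, vi):
    """F1: Vi -> Vj, e.g. V1 = sin(V2)."""
    return [op(x) for x in vi]


def f2_vector_binary(op, vi, vj):
    """F2: Vi x Vj -> Vk, e.g. V3 = V1 + V2."""
    if len(vi) != len(vj):
        raise ValueError("vector operands must have the same length")
    return [op(a, b) for a, b in zip(vi, vj)]


def f3_vector_scalar(op, s, vi):
    """F3: s x Vi -> Vj, e.g. V2 = 6 + V1."""
    return [op(s, x) for x in vi]


def f4_vector_load(memory, base, length):
    """F4: M -> V, copy `length` words starting at `base` into a register."""
    return list(memory[base:base + length])


def f5_vector_store(memory, base, v):
    """F5: V -> M, write the register contents back starting at `base`."""
    memory[base:base + len(v)] = v


# The slide's examples, with made-up operand values.
v2 = [0.0, math.pi / 2]
v1 = f1_vector_unary(math.sin, v2)           # V1 = sin(V2)
v3 = f2_vector_binary(operator.add, v1, v2)  # V3 = V1 + V2
v2_new = f3_vector_scalar(operator.add, 6, v1)  # V2 = 6 + V1
```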
• Masking
o F10: Vi × Vm → Vj (Vm is a binary vector)
• Examples…
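Since the slide's own masking examples are not reproduced here, the following is a sketch under stated assumptions rather than the book's example. It shows two common readings of a masked vector operation with a binary mask Vm: compressing Vi down to the elements whose mask bit is 1, and applying an operation only where the mask bit is 1. Both function names are hypothetical.

```python
def mask_compress(vi, vm):
    """Keep only the elements of Vi whose mask bit in Vm is 1 (shorter result)."""
    if len(vi) != len(vm):
        raise ValueError("vector and mask must have the same length")
    return [x for x, bit in zip(vi, vm) if bit]


def masked_apply(op, vi, vm):
    """Apply op where the mask bit is 1; pass the element through unchanged otherwise."""
    if len(vi) != len(vm):
        raise ValueError("vector and mask must have the same length")
    return [op(x) if bit else x for x, bit in zip(vi, vm)]


compressed = mask_compress([10, 20, 30, 40], [1, 0, 1, 1])
negated = masked_apply(lambda x: -x, [1, 2, 3], [0, 1, 0])
```

Here `compressed` holds the three selected elements and `negated` has only its middle element changed.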
• Vector Loops
o Vector segmentation or strip-mining approach
o Example
• Vector Chaining
o Example: SAXPY code
• Limited Chaining using only one memory-access pipe in Cray-1
• Complete Chaining using three memory-access pipes in Cray X-MP
• SIMD Instructions
o Scalar Operations
• Arithmetic/Logical
o Vector Operations
• Arithmetic/Logical
o Data Routing Operations
• Permutations, broadcasts, multicasts, rotation and shifting
o Masking Operations
• Enable/Disable PEs
• Host and I/O
• Bit-slice and Word-slice Processing
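o WSBS (word-serial, bit-serial), WSBP (word-serial, bit-parallel), WPBS (word-parallel, bit-serial), WPBP (word-parallel, bit-parallel)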
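Returning to the vector loops and SAXPY items earlier in this list, strip-mining can be sketched as a loop that processes a long vector in segments no longer than the vector register. The code below computes SAXPY (Y = a × X + Y) one strip at a time. The function name and the default register length of 64 are assumptions for illustration. Chaining itself is a hardware feature that forwards results between pipelines, so it appears here only as comments marking which steps would overlap.

```python
def saxpy_strip_mined(a, x, y, vector_length=64):
    """Compute a * x + y in strips of at most `vector_length` elements."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    if vector_length <= 0:
        raise ValueError("vector_length must be positive")

    result = list(y)
    for start in range(0, len(x), vector_length):
        end = min(start + vector_length, len(x))   # the last strip may be shorter
        xs = x[start:end]                          # vector load of X
        ys = result[start:end]                     # vector load of Y
        # Multiply and add; on a chained machine these steps overlap with the loads.
        strip = [a * xi + yi for xi, yi in zip(xs, ys)]
        result[start:end] = strip                  # vector store of Y
    return result


# Five elements with a register length of 2 gives strips of 2, 2 and 1.
out = saxpy_strip_mined(2, [1, 2, 3, 4, 5], [10, 10, 10, 10, 10], vector_length=2)
```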
Chapter 9
…Dataflow Architectures
DATAFLOW AND HYBRID ARCHITECTURES
• Data-driven machines
• Evolution of Dataflow Machines
• Dataflow Graphs
o Dataflow Graphs examples.
o Activity Templates and Activity Store
o Example: dataflow graph for cos x
• $\cos x \approx 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!}$
o More examples
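The data-driven idea behind the cos x example can be sketched with a tiny interpreter in which a node fires as soon as all of its operand tokens are available. The graph layout below is an assumption made for illustration, not necessarily the graph drawn in the book, and the names `run_dataflow`, `COS_GRAPH` and `cos_dataflow` are hypothetical.

```python
import math
import operator


def run_dataflow(nodes, inputs):
    """Fire each node once all of its operand tokens are present."""
    tokens = dict(inputs)      # token name -> value
    pending = dict(nodes)      # nodes that have not fired yet
    while pending:
        ready = [name for name, (_, args) in pending.items()
                 if all(arg in tokens for arg in args)]
        if not ready:
            raise ValueError("graph has operands that never arrive")
        # Every node in `ready` could fire in parallel on a dataflow machine.
        for name in ready:
            op, args = pending.pop(name)
            tokens[name] = op(*(tokens[arg] for arg in args))
    return tokens


# node name -> (operation, operand token names)
COS_GRAPH = {
    "x2": (operator.mul, ("x", "x")),
    "x4": (operator.mul, ("x2", "x2")),
    "x6": (operator.mul, ("x4", "x2")),
    "t2": (operator.truediv, ("x2", "f2")),   # x^2 / 2!
    "t4": (operator.truediv, ("x4", "f4")),   # x^4 / 4!
    "t6": (operator.truediv, ("x6", "f6")),   # x^6 / 6!
    "s1": (operator.sub, ("one", "t2")),
    "s2": (operator.add, ("s1", "t4")),
    "cos": (operator.sub, ("s2", "t6")),
}


def cos_dataflow(x):
    """Evaluate 1 - x^2/2! + x^4/4! - x^6/6! by firing the graph."""
    inputs = {"x": x, "one": 1.0, "f2": 2.0, "f4": 24.0, "f6": 720.0}
    return run_dataflow(COS_GRAPH, inputs)["cos"]
```

No program counter orders the nodes: the dictionary order of `COS_GRAPH` is irrelevant, and only operand availability decides when a node fires. For small x, `cos_dataflow(x)` agrees closely with `math.cos(x)`.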