Lec15 Snoop Coherence
Vassilis Papaefstathiou
Iakovos Mavroidis
[Figure: two nodes, each showing Processor, Control, Memory, and Output]
                                                 # of Proc
Communication model    Message passing           8 to 2048
                       Shared address, NUMA      8 to 256
                       Shared address, UMA       2 to 64
Physical connection    Network                   8 to 256
                       Bus                       2 to 36
Single Bus (Shared Address UMA) Multi’s
[Figure: Proc1, Proc2, Proc3, and Proc4 connected by a single bus to Memory and I/O]
□ Caches are used to reduce latency and to lower bus traffic
● Write-back caches are used to keep bus traffic at a minimum
□ Hardware must ensure that caches and memory are consistent
(cache coherency)
□ Hardware must also provide a mechanism to support process
synchronization
Multiprocessor Cache Coherency
□ Cache coherency protocols
● Bus snooping – cache controllers monitor shared bus traffic with
duplicate address tag hardware (so they don’t interfere with
processor’s access to the cache)
[Figure: single bus connected to Memory and I/O]
Bus Snooping Protocols
□ Multiple copies are not a problem when reading
□ A processor must have exclusive access to write a word
● What happens if two processors try to write to the same shared
data word in the same clock cycle? The bus arbiter decides
which processor gets the bus first, and that processor gets
exclusive access first. The second processor then gets
exclusive access. Thus, bus arbitration forces sequential
behavior.
● This sequential consistency is the most conservative of the
memory consistency models. With it, the result of any
execution is the same as if the accesses of each processor
were kept in order and the accesses among different
processors were interleaved.
□ All other processors sharing that data must be informed
of writes
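The serializing effect of the bus arbiter can be sketched as follows. This is a minimal illustration, not a hardware model: the `BusArbiter` class, its lowest-id-first priority rule, and the request tuples are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class BusArbiter:
    """Hypothetical arbiter: grants the bus to one requester at a time."""
    log: list = field(default_factory=list)  # global order of writes on the bus

    def arbitrate(self, requests):
        # Assumed policy: lowest processor id wins; real arbiters may differ.
        for proc_id, addr, value in sorted(requests):
            self.log.append((proc_id, addr, value))
        return self.log


def final_value(log, addr):
    """Value of addr after replaying the bus log in order."""
    value = None
    for _, a, v in log:
        if a == addr:
            value = v
    return value


# Two processors try to write the same word in the same cycle.
arbiter = BusArbiter()
order = arbiter.arbitrate([(2, 0x40, "B"), (1, 0x40, "A")])
print(order)                     # one write is granted first, then the other
print(final_value(order, 0x40))  # every snooper sees the same last write
```

Whatever priority rule is used, the point is that the bus log is a single total order, so all caches agree on which write came last.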
Handling Writes
Ensuring that all other processors sharing data are
informed of writes can be handled in two ways:
1. Write-update (write-broadcast) – the writing processor
broadcasts the new data over the bus, and all copies are
updated
● All writes go to the bus → higher bus traffic
● Since new values appear in caches sooner, this can reduce latency
2. Write-invalidate – the writing processor issues an invalidation
signal on the bus; cache snoops check whether they have a
copy of the data and, if so, invalidate their cache block
containing the word (this allows multiple readers but
only one writer)
● Uses the bus only on the first write → lower bus traffic, so better
use of bus bandwidth
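The traffic trade-off between the two policies can be made concrete with a small counting sketch. It rests on simplifying assumptions that are not from the slides: a single block, no evictions, one bus transaction per read miss, and a write miss that costs a single transaction under either policy. The function name `bus_transactions` and the trace format are hypothetical.

```python
def bus_transactions(trace, policy):
    """Count bus transactions for a trace of (processor, op) on one block.

    op is "r" or "w"; policy is "update" or "invalidate".
    """
    valid = set()   # processors currently holding a valid copy
    owner = None    # processor with exclusive access (invalidate policy only)
    count = 0
    for proc, op in trace:
        if op == "r":
            if proc not in valid:
                count += 1        # read miss: fetch the block over the bus
                valid.add(proc)
                owner = None      # block is shared again
        elif policy == "update":
            count += 1            # every write is broadcast
            valid.add(proc)
        elif owner != proc:
            count += 1            # first write: invalidate the other copies
            valid = {proc}
            owner = proc
        # otherwise the owner writes again and the bus stays idle
    return count


# P2 holds a copy, then P1 writes four times in a row.
burst = [(2, "r")] + [(1, "w")] * 4
# P2 re-reads the word after each of P1's writes.
ping_pong = [(2, "r"), (1, "w"), (2, "r"), (1, "w"), (2, "r")]

for name, trace in (("burst", burst), ("ping_pong", ping_pong)):
    print(name, bus_transactions(trace, "update"),
          bus_transactions(trace, "invalidate"))
```

Under these assumptions, repeated writes by one processor favor write-invalidate, while a reader that consumes every new value favors write-update because its copy never goes stale.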
A Write-Invalidate CC Protocol
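[Figure: state transition diagram of the write-invalidate protocol for a write-back cache (write-back caching protocol drawn in black). States: Invalid, Shared (clean), Modified (dirty). Surviving edge labels: read (miss), read (hit or miss), write (miss), and the fragment "to this block". Example labels: block A in states S and I, then in states M and I.]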
MESI Local Write Miss (1)
□ No other copies
● Value read from memory to local cache (?)
● Value updated
● Local copy state set to M
MESI Local Write Miss (2)
□ Other copies, either one in state E or more in state S
● Value read from memory to local cache; bus transaction marked
RWITM (read with intent to modify)
● Snooping processors see this and set their copy state to I
● Local copy updated and state set to M
MESI Local Write Miss (3)
Another copy in state M
□ Processor issues bus transaction marked RWITM
□ Snooping processor sees this
● Blocks RWITM request
● Takes control of bus
● Writes back its copy to memory
● Sets its copy state to I
MESI Local Write Miss (4)
Another copy in state M (continued)
□ Original local processor re-issues RWITM request
□ This is now the simple no-copy case
● Value read from memory to local cache
● Local copy value updated
● Local copy state set to M
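The four write-miss cases above can be walked through in a short sketch. It tracks only the state of one block in each cache; the function name `local_write_miss`, the cache ids, and the event strings are hypothetical. One assumption is made explicit in the code: the sketch issues RWITM in the no-copy case too, which the slides leave open.

```python
def local_write_miss(states, writer):
    """Apply a local write miss by `writer`; return the bus events issued.

    `states` maps a cache id to one of "M", "E", "S", "I" for a single
    block and is updated in place.
    """
    assert states[writer] == "I", "a write miss means no valid local copy"
    events = []
    others = [c for c in states if c != writer]
    dirty = [c for c in others if states[c] == "M"]
    if dirty:
        # Case (3): the M holder blocks the request and writes back first.
        events += ["RWITM (blocked)", f"write-back by {dirty[0]}"]
        states[dirty[0]] = "I"
        # Case (4): the writer re-issues RWITM; this is now the no-copy case.
    # Assumption: RWITM is issued even when no other cache holds a copy.
    events.append("RWITM")
    for c in others:
        if states[c] in ("E", "S"):
            states[c] = "I"   # Case (2): snoopers invalidate their copies
    states[writer] = "M"      # value read, updated, and marked Modified
    return events


shared = {"P1": "I", "P2": "S", "P3": "S"}
print(local_write_miss(shared, "P1"), shared)

modified = {"P1": "I", "P2": "M"}
print(local_write_miss(modified, "P1"), modified)
```

In every case the writer ends in M and all other caches end in I; the cases differ only in how many bus transactions it takes to get there.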
Putting it all together
□ All of this information can be described compactly using
a state transition diagram
□ The diagram shows what happens to a cache line in a
processor as a result of
● memory accesses made by that processor (read hit/miss, write
hit/miss)
● memory accesses made by other processors that result in bus
transactions observed by this snoopy cache (Mem Read,
RWITM, Invalidate)
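A state transition diagram of this kind can equally be written as a lookup table. The sketch below encodes the textbook MESI transitions; the event names, the `LOCAL` and `SNOOP` tables, and the two helper functions are hypothetical, and the snoop-side entries follow the standard protocol rather than anything spelled out in the text above.

```python
# (state, local event) -> (next state, bus transaction); None = no bus traffic
LOCAL = {
    ("I", "read_miss_shared"):    ("S", "Mem Read"),
    ("I", "read_miss_exclusive"): ("E", "Mem Read"),
    ("I", "write_miss"):          ("M", "RWITM"),
    ("S", "read_hit"):            ("S", None),
    ("S", "write_hit"):           ("M", "Invalidate"),
    ("E", "read_hit"):            ("E", None),
    ("E", "write_hit"):           ("M", None),
    ("M", "read_hit"):            ("M", None),
    ("M", "write_hit"):           ("M", None),
}

# (state, observed bus transaction) -> (next state, copy back needed?)
SNOOP = {
    ("S", "Mem Read"):   ("S", False),
    ("S", "RWITM"):      ("I", False),
    ("S", "Invalidate"): ("I", False),
    ("E", "Mem Read"):   ("S", False),
    ("E", "RWITM"):      ("I", False),
    ("M", "Mem Read"):   ("S", True),
    ("M", "RWITM"):      ("I", True),
}


def step_local(state, event):
    """Transition caused by this processor's own access."""
    return LOCAL[(state, event)]


def step_snoop(state, transaction):
    """Transition caused by a bus transaction from another processor.

    Pairs not listed (for example anything seen in state I) leave the
    line unchanged and need no copy back.
    """
    return SNOOP.get((state, transaction), (state, False))


print(step_local("S", "write_hit"))
print(step_snoop("M", "Mem Read"))
```

Keeping local and snooped events in separate tables mirrors the two halves of the diagram: one for accesses made by this processor, one for bus transactions it observes.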
MESI – locally initiated accesses
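[Figure: MESI state transition diagram for locally initiated accesses. States: Invalid, Shared, Exclusive, Modified. Edge labels, with the bus transaction in parentheses: Invalid to Shared on Read Miss(SH) (Mem Read); Invalid to Exclusive on Read Miss(EX) (Mem Read); Invalid to Modified on Write Miss (RWITM); Shared to Modified on Write Hit (Invalidate); Exclusive to Modified on Write Hit; Read Hit loops on Shared, Exclusive, and Modified; Write Hit loop on Modified.]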
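[Figure: MESI state transition diagram for bus transactions observed by snooping. States: Invalid, Shared, Modified, Exclusive. Edge labels: Mem Read, RWITM, Invalidate. Legend: marked edges = copy back.]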
MESI notes