Cache coherence protocols must enforce two rules:
- Write propagation: Writes eventually become visible to all processors.
- Write serialization: Writes to the same location are serialized (all processors see them in the same order).
How to ensure write propagation?
- Write-invalidate protocols: Invalidate all other cached copies before performing the write.
- Write-update protocols: Update all other cached copies after performing the write.
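The two propagation strategies can be contrasted with a toy model. This is only a sketch under simplifying assumptions: each cache is a plain dict from address to value, main memory is ignored, and the function names `write_invalidate` and `write_update` are made up for illustration.

```python
# Toy model: each cache is a dict {address: value}; main memory is ignored.

def write_invalidate(caches, writer, addr, value):
    """Drop every other cached copy, then perform the write locally."""
    for i, cache in enumerate(caches):
        if i != writer:
            cache.pop(addr, None)      # invalidate the remote copy, if any
    caches[writer][addr] = value


def write_update(caches, writer, addr, value):
    """Perform the write locally, then push the new value to existing copies."""
    caches[writer][addr] = value
    for i, cache in enumerate(caches):
        if i != writer and addr in cache:
            cache[addr] = value        # update the remote copy in place


inv = [{0x10: 1}, {0x10: 1}, {}]
write_invalidate(inv, writer=0, addr=0x10, value=2)

upd = [{0x10: 1}, {0x10: 1}, {}]
write_update(upd, writer=0, addr=0x10, value=2)
```

After the calls, only cache 0 holds the address in `inv`, while in `upd` caches 0 and 1 both hold the new value. In both cases no cache is left with the old value, which is what write propagation requires.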
How to ensure write serialization?
- Snooping-based protocols: All caches observe each other's actions through a shared bus.
- Directory-based protocols: A coherence directory tracks contents of private caches and serializes requests.
Snooping-Based Coherence
- Many processors run in parallel; their caches are connected through a shared bus, which in turn connects to main memory.
- On a cache hit, the cache returns the data. On a cache miss, the data is fetched from main memory.
- A snoopy cache watches (snoops on) the bus to keep all processors' views of memory coherent. (Each cache has to listen to both its processor and the shared bus.)
How to achieve?
- Bus provides serialization point
- Broadcast, totally ordered
- Each cache controller "snoops" all bus transactions
- Controller updates state of cache in response to processor and snoop events and generates bus transactions
- Snoopy protocol (Finite State Machine, FSM)
- State-transition diagram
- Actions
A Simple Protocol: Valid/Invalid (VI)
- VI Drawbacks:
- Every write updates main memory
- Every write requires broadcast & snoop
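The VI drawbacks can be made concrete with a small simulation. The notes do not spell out the VI transitions, so this sketch assumes a write-through cache that does not allocate a line on a write miss; the `Bus` and `VICache` classes and their methods are hypothetical names for illustration.

```python
class Bus:
    """Shared bus plus main memory; every write is broadcast to all caches."""

    def __init__(self):
        self.memory = {}
        self.caches = []
        self.transactions = 0

    def read(self, addr):
        self.transactions += 1
        return self.memory.get(addr, 0)

    def write(self, source, addr, value):
        self.transactions += 1
        self.memory[addr] = value            # write-through to main memory
        for cache in self.caches:
            if cache is not source:
                cache.snoop_write(addr)      # every other cache snoops the write


class VICache:
    def __init__(self, bus):
        self.bus = bus
        self.lines = {}                      # address present = Valid, absent = Invalid
        bus.caches.append(self)

    def load(self, addr):
        if addr not in self.lines:           # Invalid -> Valid needs a bus read
            self.lines[addr] = self.bus.read(addr)
        return self.lines[addr]

    def store(self, addr, value):
        if addr in self.lines:
            self.lines[addr] = value         # keep a Valid local copy up to date
        self.bus.write(self, addr, value)    # every store goes on the bus

    def snoop_write(self, addr):
        self.lines.pop(addr, None)           # Valid -> Invalid on a remote write


bus = Bus()
c0, c1 = VICache(bus), VICache(bus)
c0.load(0x40)
c1.load(0x40)
c0.store(0x40, 7)                  # invalidates c1's copy
value_seen_by_c1 = c1.load(0x40)   # miss, refetched from memory
for v in range(3):
    c0.store(0x40, v)              # three more stores, three more bus transactions
```

Here `bus.transactions` grows by one for every store, even when `c0` writes the same address repeatedly, which is exactly the cost the two drawbacks describe.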
Maintaining Coherence
- In a coherent memory all loads and stores can be placed in a global order
- However, multiple copies of an address in various caches can cause this property to be violated: with several copies, loads and stores may no longer fit a single global order, which makes memory incoherent
- This property can be ensured if:
- Only one cache at a time has the write permission for an address
- No cache can have a stale copy of the data after a write to the address has been performed (write-invalidate)
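The two conditions amount to a single-writer, multiple-reader rule per address. A minimal checker, assuming each cache's permission for one address is encoded as `"W"`, `"R"`, or `None` (an encoding invented here for illustration):

```python
def is_coherent(permissions):
    """permissions: one entry per cache for a single address: 'W', 'R', or None."""
    writers = permissions.count("W")
    readers = permissions.count("R")
    if writers == 0:
        return True                     # any number of read-only copies is fine
    # With a writer present, there must be exactly one and no other copies,
    # otherwise a reader could keep a stale value after the write.
    return writers == 1 and readers == 0
```

For example, `is_coherent(["R", "R", None])` and `is_coherent(["W", None, None])` hold, while `is_coherent(["W", "R", None])` does not, because the reader's copy would go stale.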
Modified/Shared/Invalid (MSI) Protocol
MSI refines the VI protocol by splitting the valid state into a read-only shared state and a read-write modified state.
- Each line in each cache maintains MSI state:
- I (Invalid): the cache does not hold a valid copy of the address. A line that was invalidated may have been modified by another processor, so its old value must not be used.
- S (Shared): the cache has the address, but so may other caches; hence it can only be read. The data has not been modified.
- M (Modified): only this cache has the address; hence it can be read and written. Any other cache that had this address was invalidated.
- VI Drawbacks: Every write updates main memory, and every write requires broadcast & snoop
- MSI: Allows writeback caches + satisfies writes locally
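The MSI state machine can be written as a transition table. This is a sketch of the common textbook formulation, not necessarily the exact variant the lecture uses; the event and action names (`PrRd`, `PrWr`, `BusRd`, `BusRdX`, `Flush`) are conventional labels assumed here, and `run` is a hypothetical helper.

```python
# (current state, event) -> (next state, bus action or None)
# PrRd / PrWr come from the local processor; BusRd / BusRdX are snooped.
MSI = {
    ("I", "PrRd"):   ("S", "BusRd"),
    ("I", "PrWr"):   ("M", "BusRdX"),
    ("I", "BusRd"):  ("I", None),
    ("I", "BusRdX"): ("I", None),
    ("S", "PrRd"):   ("S", None),
    ("S", "PrWr"):   ("M", "BusRdX"),   # must invalidate other sharers first
    ("S", "BusRd"):  ("S", None),
    ("S", "BusRdX"): ("I", None),
    ("M", "PrRd"):   ("M", None),
    ("M", "PrWr"):   ("M", None),       # satisfied locally, no bus traffic
    ("M", "BusRd"):  ("S", "Flush"),    # write the dirty line back
    ("M", "BusRdX"): ("I", "Flush"),
}


def run(events, state="I"):
    """Apply a sequence of events to one cache line; return final state and bus actions."""
    actions = []
    for event in events:
        state, action = MSI[(state, event)]
        if action is not None:
            actions.append(action)
    return state, actions
```

Calling `run(["PrRd", "PrWr", "PrWr", "PrWr"])` ends in state `"M"` with only two bus actions, `["BusRd", "BusRdX"]`: once the line is Modified, further writes stay local, unlike in VI.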
MSI Optimizations: Exclusive State
- Observation: Doing read-modify-write sequences on private data is common
- What's the problem with MSI?
- 2 bus transactions for every read-modify-write of private data
- Solution: E state (exclusive, clean)
- If no other sharers, a read acquires line in E instead of S
- Writes silently cause E -> M (exclusive, dirty)
MESI: An Enhanced MSI protocol
- Increased performance for private read-write data: avoids wasting bus bandwidth on updates to data held in only one private cache
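The saving from the E state can be counted directly. This sketch models one load followed by one store to a line that starts Invalid; the function name and the `others_have_copy` flag are assumptions for illustration, and the upgrade from S is counted as one bus transaction.

```python
def rmw_bus_transactions(protocol, others_have_copy=False):
    """Bus transactions for a load then a store to a line that starts Invalid."""
    transactions = 1                      # the load misses and needs a bus read
    if protocol == "MESI" and not others_have_copy:
        state = "E"                       # no other sharers: exclusive, clean
    else:
        state = "S"                       # MSI has no E state, so always S
    if state == "S":
        transactions += 1                 # the store must invalidate possible sharers
    # From E the store is silent: E -> M without touching the bus.
    return transactions
```

Under these assumptions `rmw_bus_transactions("MSI")` gives 2 and `rmw_bus_transactions("MESI")` gives 1, while `rmw_bus_transactions("MESI", others_have_copy=True)` is still 2, since the line is genuinely shared.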
Directory-Based Coherence
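- Motivation: The snooping bus is often the performance bottleneck; as the number of processors grows, the bus becomes congested. With n CPUs the bus needs roughly n times the bandwidth, and every CPU must process the traffic of all the other CPUs, so about n^2 snoop events are handled in total.
- Route all coherence transactions through a directory; a processor consults the directory to find out whether and where the data block it needs is held
- Tracks contents of private caches -> No broadcasts; requests go only to the caches that actually hold the block, and each directory maintains only the part of memory assigned to it
- Serves as ordering point for conflicting requests -> Unordered networks: because the directory decides the order of conflicting requests to an address, the interconnect does not have to deliver messages in a global order the way a shared bus does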
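A directory can be sketched as a map from address to the set of caches holding it. This is a deliberately reduced model: it assumes a single directory, tracks only sharer sets (no owner or dirty state, no acknowledgements), and the `Directory` class and its methods are hypothetical.

```python
class Directory:
    def __init__(self):
        self.sharers = {}        # address -> set of cache ids holding a copy
        self.invalidations = 0   # point-to-point messages sent so far

    def read(self, cache_id, addr):
        """Record that cache_id now holds a copy of addr."""
        self.sharers.setdefault(addr, set()).add(cache_id)

    def write(self, cache_id, addr):
        """Grant write permission; invalidate only the caches that hold addr."""
        targets = self.sharers.get(addr, set()) - {cache_id}
        self.invalidations += len(targets)
        self.sharers[addr] = {cache_id}      # the writer is now the only holder
        return sorted(targets)


directory = Directory()
directory.read(3, 0x80)
directory.read(7, 0x80)
invalidated = directory.write(3, 0x80)
```

Here `invalidated` is `[7]`: however many caches the system has, only the one other sharer receives a message, where a snooping bus would have broadcast to all of them. Because every request for `0x80` passes through `directory`, it also fixes the order of conflicting writes.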
Ref
- Architecture Study 15: cache coherence (体系结构学习15-cache coherence)
- MIT 6.004 L25_ Cache Coherence