Cache coherence protocols must enforce two rules:

  • Write propagation: Writes eventually become visible to all processors.
  • Write serialization: Writes to the same location are serialized (all processors see them in the same order).

How to ensure write propagation?

  • Write-invalidate protocols: Invalidate all other cached copies before performing the write.
  • Write-update protocols: Update all other cached copies after performing the write.
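
The difference between the two approaches can be sketched in a few lines of Python. This is a toy model, not real hardware behaviour: each cache is assumed to be a plain dict from address to value, and the function names `write_invalidate` and `write_update` are made up for illustration.

```python
# Hypothetical toy model: caches maps a cache id to a dict of address -> value.

def write_invalidate(caches, writer, addr, value):
    """Drop every other cached copy of addr, then perform the write."""
    for cid, cache in caches.items():
        if cid != writer:
            cache.pop(addr, None)      # invalidate before the write
    caches[writer][addr] = value


def write_update(caches, writer, addr, value):
    """Perform the write, then push the new value to existing copies."""
    caches[writer][addr] = value
    for cid, cache in caches.items():
        if cid != writer and addr in cache:
            cache[addr] = value        # update after the write


inv = {0: {0x10: 1}, 1: {0x10: 1}}
write_invalidate(inv, writer=0, addr=0x10, value=2)

upd = {0: {0x10: 1}, 1: {0x10: 1}}
write_update(upd, writer=0, addr=0x10, value=2)
```

After the calls, cache 1 in `inv` no longer holds the address and must re-fetch it on its next read, while cache 1 in `upd` already holds the new value.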

How to ensure write serialization?

  • Snooping-based protocols: All caches observe each other's actions through a shared bus.
  • Directory-based protocols: A coherence directory tracks contents of private caches and serializes requests.

Snooping-Based Coherence

  • Many processors run in parallel; their caches are connected to each other, and to main memory, through a shared bus.
  • On a cache hit, the cache returns the data. On a cache miss, the data is fetched from main memory.
  • A snoopy cache watches (snoops on) the bus to keep all processors' views of memory coherent, so each cache has to listen to both its processor and the shared bus.

How is this achieved?

  • Bus provides serialization point
    • Broadcast, totally ordered
    • Each cache controller "snoops" all bus transactions
    • Controller updates state of cache in response to processor and snoop events and generates bus transactions
  • Snoopy protocol (Finite State Machine, FSM)
    • State-transition diagram
    • Actions
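
A minimal sketch of the bus as a serialization point, under the assumption that a broadcast is delivered to every attached controller before the next one starts. The `Bus` and `Snooper` classes and the transaction tuples are hypothetical names for illustration only.

```python
class Bus:
    """Toy shared bus: one transaction at a time, delivered to everyone."""

    def __init__(self):
        self.snoopers = []
        self.log = []                  # the single global order of transactions

    def attach(self, snooper):
        self.snoopers.append(snooper)

    def broadcast(self, source, kind, addr):
        txn = (source, kind, addr)
        self.log.append(txn)
        for s in self.snoopers:        # every controller snoops every transaction
            s.snoop(txn)


class Snooper:
    """Toy cache controller that only records what it observes on the bus."""

    def __init__(self, cid, bus):
        self.cid = cid
        self.seen = []
        bus.attach(self)

    def snoop(self, txn):
        self.seen.append(txn)


bus = Bus()
c0, c1, c2 = (Snooper(i, bus) for i in range(3))
bus.broadcast(0, "write", 0x40)
bus.broadcast(1, "write", 0x40)
```

Because every controller sees the two writes to `0x40` in the same order as `bus.log`, they all agree on which write came last, which is the write serialization rule.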

A Simple Protocol: Valid/Invalid (VI)

  • VI Drawbacks:
    • Every write updates main memory
    • Every write requires broadcast & snoop
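
The two drawbacks can be made concrete with a small counter-based sketch. The notes do not spell out the VI transitions, so this assumes a write-through cache in which a read miss fetches over the bus and every write is broadcast, updates memory, and invalidates other copies. `VISystem` is a hypothetical name.

```python
class VISystem:
    """Toy VI model; an address present in a cache dict means state Valid."""

    def __init__(self, n_caches):
        self.memory = {}
        self.caches = [dict() for _ in range(n_caches)]
        self.bus_transactions = 0
        self.memory_writes = 0

    def read(self, cid, addr):
        cache = self.caches[cid]
        if addr not in cache:              # Invalid: miss, fetch over the bus
            self.bus_transactions += 1
            cache[addr] = self.memory.get(addr, 0)
        return cache[addr]

    def write(self, cid, addr, value):
        self.bus_transactions += 1         # every write is broadcast
        self.memory[addr] = value          # assumed write-through
        self.memory_writes += 1
        for other, cache in enumerate(self.caches):
            if other != cid:
                cache.pop(addr, None)      # snooping caches invalidate
        self.caches[cid][addr] = value


vi = VISystem(2)
vi.read(0, 0x10)
for v in range(1, 4):
    vi.write(0, 0x10, v)
```

Even though only cache 0 ever touches the address, the three writes cost three bus transactions and three memory updates on top of the initial read miss.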

Maintaining Coherence

  • In a coherent memory all loads and stores can be placed in a global order
    • However, multiple copies of an address in various caches can cause this property to be violated (that is, memory becomes incoherent)
  • This property can be ensured if:
    • Only one cache at a time has the write permission for an address
    • No cache can have a stale copy of the data after a write to the address has been performed (write-invalidate)
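
The two conditions can be read as an invariant over the permissions that the caches hold for one address. A minimal sketch, assuming each cache holds exactly one of the hypothetical permission labels 'write', 'read', or 'none':

```python
def is_coherent(permissions):
    """Check the single-writer rule for one address.

    permissions has one entry per cache: 'write', 'read', or 'none'.
    Either nobody may write, or exactly one cache may write and no
    other cache still holds a readable (possibly stale) copy.
    """
    writers = permissions.count("write")
    readers = permissions.count("read")
    return writers == 0 or (writers == 1 and readers == 0)
```

For example, `is_coherent(["read", "read", "none"])` holds, while `is_coherent(["write", "read"])` does not, because the reader could observe a stale value.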

Modified/Shared/Invalid (MSI) Protocol

MSI refines the VI protocol by splitting the valid state into a read-only Shared state and a read-write Modified state.

  • Each line in each cache maintains MSI state:
    • I - cache doesn't contain the address, or its copy is no longer up to date (for example, because another processor has modified the block)
    • S - cache has the address but so may other caches; hence it can only be read. The data has not been modified and may be read by multiple caches
    • M - only this cache has the address, so it can be both read and written; any other cache that had this address has been invalidated
  • VI Drawbacks: Every write updates main memory, and every write requires broadcast & snoop
  • MSI: Allows writeback caches + satisfies writes locally
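
The MSI state machine can be sketched as two lookup tables, one for processor events and one for snooped bus transactions. The transaction names `BusRd` and `BusRdX` follow common textbook usage and are an assumption here, as is the simplification to a single address with no data or writeback modelled.

```python
# (state, processor event) -> (next state, bus transaction or None)
PROC = {
    ("I", "read"):  ("S", "BusRd"),
    ("I", "write"): ("M", "BusRdX"),
    ("S", "read"):  ("S", None),
    ("S", "write"): ("M", "BusRdX"),
    ("M", "read"):  ("M", None),
    ("M", "write"): ("M", None),     # write satisfied locally
}

# (state, snooped transaction) -> next state
SNOOP = {
    ("M", "BusRd"): "S", ("M", "BusRdX"): "I",
    ("S", "BusRd"): "S", ("S", "BusRdX"): "I",
    ("I", "BusRd"): "I", ("I", "BusRdX"): "I",
}


def access(states, cid, event, bus_log):
    """Apply one processor event in cache cid; other caches snoop the bus."""
    nxt, txn = PROC[(states[cid], event)]
    if txn is not None:
        bus_log.append((cid, txn))
        for other in range(len(states)):
            if other != cid:
                states[other] = SNOOP[(states[other], txn)]
    states[cid] = nxt


states = ["I", "I"]
bus_log = []
access(states, 0, "read", bus_log)    # cache 0: I -> S
access(states, 1, "read", bus_log)    # both caches in S
access(states, 0, "write", bus_log)   # cache 0: S -> M, cache 1: S -> I
access(states, 0, "write", bus_log)   # hit in M, no bus traffic
```

The last write generates no bus transaction, which is the improvement over VI: once a line is in M, further writes stay local.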

MSI Optimizations: Exclusive State

  • Observation: Doing read-modify-write sequences on private data is common
    • What's the problem with MSI?
      • 2 bus transactions for every read-modify-write of private data
  • Solution: E state (exclusive, clean)
    • If no other sharers, a read acquires line in E instead of S
    • Writes silently cause E -> M (exclusive, dirty)
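
The saving can be checked by counting bus transactions for one read followed by one write to a line that starts out Invalid. This is a minimal sketch; `rmw_bus_transactions` and its parameters are hypothetical names.

```python
def rmw_bus_transactions(protocol, other_sharers=False):
    """Bus transactions for a read then a write to an initially Invalid line."""
    bus = 1                        # the read miss always goes over the bus
    if protocol == "MESI" and not other_sharers:
        state = "E"                # exclusive, clean
    else:
        state = "S"
    if state == "S":
        bus += 1                   # S -> M must be announced on the bus
    # E -> M is silent: no other cache holds a copy to invalidate
    return bus
```

Under these assumptions `rmw_bus_transactions("MSI")` gives 2 and `rmw_bus_transactions("MESI")` gives 1; with `other_sharers=True` MESI falls back to the MSI cost.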
       

MESI: An Enhanced MSI protocol

  • Increased performance for private read-write data: updates to data held by only one cache no longer waste bus bandwidth

Directory-Based Coherence

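  • Motivation: The snoopy bus is often the performance bottleneck; as the number of processors grows, the bus becomes congested. With n CPUs, the bus has to support n times the bandwidth, and every CPU has to process the messages of all other CPUs, so the total work is on the order of n^2 messages.
  • Route all coherence transactions through a directory; a processor consults the directory to find out whether that memory holds the data block it needs
    • Tracks contents of private caches -> No broadcasts; requests go only to the caches involved, and each directory maintains only the memory assigned to it
    • Serves as ordering point for conflicting requests -> Unordered networks: because the directory itself serializes conflicting requests, the interconnect no longer has to deliver messages in one total order the way a bus does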
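
A minimal sketch of why tracking sharers removes the need for broadcasts. It assumes a single directory that records, per address, the set of caches holding the line; states, data, and downgrades of a modified copy on a remote read are left out. `Directory` and its methods are hypothetical names.

```python
class Directory:
    """Toy directory: tracks which caches hold each address."""

    def __init__(self):
        self.sharers = {}      # addr -> set of cache ids holding the line
        self.messages = 0      # point-to-point invalidations sent so far

    def read(self, cid, addr):
        self.sharers.setdefault(addr, set()).add(cid)

    def write(self, cid, addr):
        holders = self.sharers.get(addr, set())
        targets = holders - {cid}
        self.messages += len(targets)   # invalidate only the actual sharers
        self.sharers[addr] = {cid}      # writer becomes the only holder
        return sorted(targets)


d = Directory()
d.read(1, 0x80)
d.read(5, 0x80)
invalidated = d.write(1, 0x80)
```

Here the write by cache 1 triggers a single invalidation (to cache 5), regardless of how many caches exist in the system; on a snoopy bus the same write would be observed by every cache.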


 
