Computer Architecture : Cache Coherence and its Types

Cache coherence is the regularity or consistency of data stored in cache memory. In a shared memory multiprocessor system with a separate cache memory for each processor, it is possible to have many copies of shared data: one copy in the main memory and one in the local cache of each processor that requested it. When one of the copies of data is changed, the other copies must reflect that change. Cache coherence is the discipline which ensures that the changes in the values of shared operands (data) are propagated throughout the system in a timely fashion. As we know there are two write policies :

  • Write back : Write operations are usually made only to the cache. Main memory is only updated when the corresponding cache line is flushed from the cache.
  • Write through : All write operations are made to main memory as well as to the cache, ensuring that main memory is always valid.

It is clear that a write back policy can result in inconsistency. If two caches contain the same line, and the line is updated in one cache, the other cache will unknowingly have an invalid value. Subsequently read to that invalid line produce invalid results. Even with the write through policy, inconsistency can occur unless other cache monitor the memory traffic or receive some direct notification of the update.

Maintaining cache and memory consistency is imperative for multiprocessors or distributed shared memory (DSM) systems. Cache management is structured to ensure that data is not overwritten or lost. Different techniques may be used to maintain cache coherency, including directory based coherence, bus snooping and snarfing. To maintain consistency, a DSM system imitates these techniques and uses a coherency protocol, which is essential to system operations. Cache coherence is also known as cache coherency or cache consistency.

For any cache coherence protocol, the objective is to let recently used local variables get into the appropriate cache and stay there through numerous reads and write, while using the protocol to maintain consistency of shared variables that might be in multiple caches at the same time.

Write through protocol: 

A write through protocol can be implemented in two fundamental versions.
  • Write through with update protocol : When a processor writes a new value into its cache, the new value is also written into the memory module that holds the cache block being changed. Some copies of this block may exist in other caches, these copies must be updated to reflect the change caused by the write operation. The simplest way of doing this is to broadcast the written data to all processor modules in the system. As each processor module receives the broadcast data, it updates the contents of the affected cache block if this block is present in its cache.
  • Write through with invalidation of copies : When a processor writes a new value into its cache, this value is written into the memory module, and all copies in the other caches are invalidated. Again broadcasting can be used to send the invalidation requests through the system.

Write back protocol:

In the write-back protocol, multiple copies of a cache block may exist if different processors have loaded (read) the block into their caches. If some processor wants to change this block, it must first become an exclusive owner of this block. When the ownership is granted to this processor by the memory module that is the home location of the block. All other copies, including the one in the memory module, are invalidated. Now the owner of the block may change the contents of the memory. When another processor wishes to read this block, the data are sent to this processor by the current owner. The data are also sent to the home memory module, which requires ownership and updates the block to contain the latest value.

There are software and hardware solutions for cache coherence problem : 

Software Solutions :

In software approach, the detecting of potential cache coherence problem is transferred from run time to compile time, and the design complexity is transferred from hardware to software. 
On the other hand, compile time; software approaches generally make conservative decisions. Compiler-based cache coherence mechanism perform an analysis on the code to determine which data items may become unsafe for caching, and they mark those items accordingly. So, there are some more cachable items, and the operating system or hardware does not cache those items. The simplest approach is to prevent any shared data variables from being cached. This is too conservative, because a shared data structure may be exclusively used during some periods and may be effectively read-only during other periods. More efficient approaches analyze the code to determine safe periods for shared variables. The compiler then inserts instructions into the generated code to enforce cache coherence during the critical periods. 

Hardware Solutions :

Hardware solution provide dynamic recognition at run time of potential inconsistency conditions. Because the problem is only dealt with when it actually arises, there is more effective use of caches, leading to improved performances over a software approach. Hardware schemes can be divided into two categories: snooping and Dictonary-based protocols.

  • Snooping :  Snooping is a process where the individual caches monitor address lines for accesses to memory locations that they have cached.[4] The write-invalidate protocols and write-update protocols make use of this mechanism. For the snooping mechanism, a snoop filter reduces the snooping traffic by maintaining a plurality of entries, each representing a cache line that may be owned by one or more nodes. When replacement of one of the entries is required, the snoop filter selects for the replacement the entry representing the cache line or lines owned by the fewest nodes, as determined from a presence vector in each of the entries. A temporal or other type of algorithm is used to refine the selection if more than one cache line is owned by the fewest number of nodes.
  • Dictonary-based protocols : In a directory-based system, the data being shared is placed in a common directory that maintains the coherence between caches. The directory acts as a filter through which the processor must ask permission to load an entry from the primary memory to its cache. When an entry is changed, the directory either updates or invalidates the other caches with that entry.

Next Topic : 

No comments:

Post a Comment