AC-WAR: Architecting the Cache Hierarchy to Improve the Lifetime of a Non-volatile Endurance-limited Main Memory

Transactions on Parallel and Distributed Systems, IEEE.


This work shows that adapting the replacement policies of contemporary cache hierarchies can extend the lifespan of a write-endurance-limited main memory by almost one order of magnitude. The idea stems from two observations about cache residency: (1) blocks are modified in a bimodal way, in that either most of a block's content is modified or most of it never changes, and (2) in most applications, the majority of blocks are only slightly modified. When cache replacement algorithms take these facts into account, the number of bit-flips per write-back to main memory can be reduced significantly. Our proposal favors the off-chip eviction of slightly modified blocks by means of an adaptive replacement algorithm that operates in coordination across L2 and L3. This significantly improves system memory lifetime with negligible performance degradation. We found that a few bits per block are enough to track changes in cache blocks with respect to the main memory content. A slightly modified sectored LRU and a simple cache performance predictor yield a simple implementation with minimal area cost and no impact on cache access time.
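The victim-selection idea can be illustrated with a minimal Python sketch. It is not the implementation evaluated in this work: the `Block` fields, the per-sector dirty bitmask, the fixed `window` parameter (a hypothetical stand-in for the adaptive mechanism driven by the performance predictor), and the tie-breaking rule are all assumptions made for illustration. Among the least recently used blocks of a set, the sketch evicts the one with the fewest modified sectors, and `bit_flips` shows the quantity such a policy tries to keep low on each write-back.

```python
from dataclasses import dataclass


@dataclass
class Block:
    tag: int
    lru_age: int        # larger value = less recently used (assumed encoding)
    dirty_sectors: int  # assumed bitmask: one bit per sector changed since the fill


def bit_flips(old: bytes, new: bytes) -> int:
    """Number of bits that writing `new` over `old` would flip in memory."""
    return sum(bin(a ^ b).count("1") for a, b in zip(old, new))


def modified_count(block: Block) -> int:
    """Number of sectors marked as modified in the block's bitmask."""
    return bin(block.dirty_sectors).count("1")


def pick_victim(cache_set: list, window: int = 4) -> Block:
    """Pick a victim among the `window` least recently used blocks.

    The block with the fewest modified sectors is preferred; ties go to
    the older block. With window=1 this degenerates to plain LRU.
    """
    oldest = sorted(cache_set, key=lambda b: b.lru_age, reverse=True)[:window]
    return min(oldest, key=lambda b: (modified_count(b), -b.lru_age))


# Hypothetical 4-way set: tag 1 is the LRU block but is heavily modified.
example_set = [
    Block(tag=0, lru_age=0, dirty_sectors=0b0000),
    Block(tag=1, lru_age=3, dirty_sectors=0b1111),
    Block(tag=2, lru_age=2, dirty_sectors=0b0001),
    Block(tag=3, lru_age=1, dirty_sectors=0b0011),
]
victim = pick_victim(example_set, window=3)
```

In this hypothetical set, plain LRU (`window=1`) would evict the heavily modified block with tag 1, whereas a window of three selects the slightly modified block with tag 2. A larger window trades recency information for fewer modified sectors per write-back, which is why some form of adaptation to cache performance is needed.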

On average, our proposal increases the memory lifetime obtained with an LRU policy by up to ten times (10x), and by up to fifteen times (15x) when combined with other memory-centric techniques. In both cases, the performance degradation can be considered negligible.