MRR: Enabling Fully Adaptive Multicast Routing for CMP Interconnection Networks

Pablo Abad, Valentin Puente, Jose Angel Gregorio

Publication

International Conference on High Performance Computer Architecture

Date

January, 2009

Links

PDF, Slides

Abstract

This paper introduces a cost-effective cache architecture called Enhanced Shared-Private Non-Uniform Cache Architecture (ESP-NUCA), which is suitable for high-performance Chip MultiProcessors (CMPs). This architecture enhances system stability by combining the advantages of private and shared caches. Starting from a shared NUCA, ESP-NUCA introduces a low-cost mechanism to dynamically allocate private cache blocks closer to their owner processor. In this way, average on-chip access latency is reduced and inter-core interference minimized. ESP-NUCA synergistically integrates victims and replicas, thus making it possible to take advantage of multiple readers for shared data, and to maximize cache usage under unbalanced core utilization. This architecture leads to stable behavior within the whole system across a broad spectrum of working scenarios. ESP-NUCA not only outperforms architectures with similar implementation costs such as private and shared caches by up to 20% and 40% respectively, but even outperforms much costlier architectures such as D-NUCA [13] by up to 28%, Adaptive Selective Replication [3] by up to 19%, and Cooperative Caching [5] by up to 15%. Moreover, performance variance throughout the set of benchmarks is 37% lower than with ASR, 87% lower than with D-NUCA, and 43% lower than with Cooperative Caching.