Multi-destination communications are a highlynecessary capability for many coherence protocols in order to minimize on-chip hit latency. Although CMPs share this necessity, up to now few suitable proposals have been developed. The combination of resource scarcity and the common idea that multicast support requires a substantial amount of extra resources is responsible for this situation. In this work, we propose a new approach suitable for on-chip networks capable of managing multi-destination traffic via hardware in an efficient way with negligible complexity. We introduce a novel multicast routing mechanism, able to circumvent many of the limitations of conventional multicast schemes. Adaptive-tree multicasting is able to maintain correctness for multi-flit multicast messages without routing restrictions, while also coupling correctness and performance in a natural way. Replication restrictions not only guarantee the presence of enough resources to avoid deadlock, but also dynamically adapt tree shape to network conditions, routing multicast messages through non-congested paths. The performance results, using a state-of-the-art full system simulation framework, show that it improves the average full system performance of a CMP by 20% and network ED2P by 15%, when compared to a state-of-the-art router with conventional multicast support and similar implementation cost.