PhD Students


  • Rotary Router: an Efficient Architecture for CMP Interconnection Networks

    Pablo Abad, Valentin Puente, Jose Angel Gregorio, Pablo Prieto, International Symposium on Computer Architecture, ISCA

    Details Pdf Slides Patent DOI

  • Flask Coherence: A Morphable Hybrid Coherence Protocol to Balance Energy, Performance and Scalability

    Lucia G. Menezo, Valentin Puente, Jose Angel Gregorio, International Symposium on High Performance Computer Architecture, IEEE, HPCA

    Details Pdf Slides Project DOI

  • Architecture of a Cortex Inspired Hierarchical Event Recaller

    Valentin Puente, ArXiv

    Details Pdf Arxiv

  • AC-WAR: Architecting the Cache Hierarchy to Improve the Lifetime of a Non-volatile Endurance-limited Main Memory

    Pablo Abad, Pablo Prieto, Valentin Puente, Jose Angel Gregorio, Transactions on Parallel and Distributed Systems, IEEE, TPDS

    Details Project DOI

  • 3D Stacking of High-Performance Processors

    Philip Emma, Alper Buyuktosunoglu, Michael Healy, Krishnan Kailas, Valentin Puente, Roy Yu, Allan Hartstein, Pradip Bose, Jaime Moreno, International Symposium on High Performance Computer Architecture, IEEE, HPCA

    Details Pdf DOI

  • ESP-NUCA: A low-cost adaptive Non-Uniform Cache Architecture

    Javier Merino, Valentin Puente, Jose Angel Gregorio, International Symposium on High Performance Computer Architecture, HPCA

    Details Pdf DOI

  • Evaluation of 'Edge Computing' architectures for the inference of large language models

    This project evaluates how small language models perform on edge devices by comparing inference across CPU, GPU, and NPU hardware. It focuses on trade-offs in latency, throughput, memory usage, and energy efficiency under realistic deployment constraints. The goal is to identify optimal backend choices and key bottlenecks to guide efficient on-device LLM deployment.

  • Rethinking CPU Backend for emerging general-purpose applications

    This project proposes redesigning the backend of general-purpose processors to execute AI workloads more efficiently, supporting Europe's goals of AI competitiveness and digital sovereignty. Based on profiling of 31 deep learning models, it identifies vector execution resources in current CPUs as the main performance bottleneck. The central hypothesis is that a modular, slice-based backend architecture can scale SIMD throughput more effectively than conventional monolithic designs while preserving CPU flexibility and broad applicability.

Resources

  • Computing Resources

    Locally available computing resources and their current operational status

  • Tools

    Computer architecture modeling tools developed by our group

Teaching

Contact info

  • Computer Engineering
  • Facultad de Ciencias
    Avda. Los Castros s/n
    39005 Santander, Cantabria
    (SPAIN)