Using Acceleration as a Service: Rethinking HPC Infrastructure in the AI Era
Presenter
September 13, 2025
Abstract
Scientific computing has undergone a fundamental transformation over the past decade. GPU-accelerated computing, which in 2015 was still at the fringes of supercomputing, has become the norm today. The recent advances — indeed, the revolution — in AI, together with the broader digital transformation of science, have placed HPC at the very center of scientific discovery. No AI-for-Science project can succeed without it.
This tremendous opportunity, however, also poses a threat to HPC centers that fail to adapt to the needs of scientists. The success of AI in science depends not only on access to sufficient compute power but equally on data, and data requires curation and domain-centric workflows. Traditional HPC operational models are ill-suited to these data-centric requirements.
In a recent publication [1], Hoefler et al. introduced the concept of Acceleration as a Service (XaaS), a step in the right direction that leverages container-based deployment of HPC workloads. In this presentation, I will go a step further and demonstrate how supercomputing infrastructures can evolve into service-oriented architectures. Drawing on principles from cloud computing and exploiting features of modern high-performance networks, we can simultaneously serve operational weather prediction, climate simulation, AI, traditional HPC, and large-scale experimental and observational instruments — all through specialized, elastic platforms that do not compromise scalability or performance.
[1] T. Hoefler et al., “XaaS: Acceleration as a Service,” arXiv preprint, arXiv:2401.04552, Jan. 2024.