This paper was accepted at ICS ’23, the 37th International Conference on Supercomputing that was held from 21 to 23 June in Orlando, Florida, USA. This paper was prepared by ETH Zürich in collaboration with OpenCore.
Abstract
Serverless functions provide elastic scaling and a fine-grained billing model, making Function-as-a-Service (FaaS) an attractive programming model. However, for distributed jobs that benefit from large-scale and dynamic parallelism, the lack of fast and cheap communication is a major limitation. Individual functions cannot communicate directly, group operations do not exist, and users resort to manual implementations of storage-based communication. This results in communication times multiple orders of magnitude slower than those found in HPC systems. We overcome this limitation and present the FaaS Message Interface (FMI). FMI is an easy-to-use, high-performance framework for general-purpose point-to-point and collective communication in FaaS applications. We support different communication channels and offer a model-driven channel selection according to performance and cost expectations. We model the interface after MPI and show that message passing can be integrated into serverless applications with minor changes, providing portable communication closer to that offered by high-performance systems. In our experiments, FMI can speed up communication for a distributed machine learning FaaS application by up to 162x, while simultaneously reducing cost by up to 397 times.
Authors
Marcin Copik (ETH Zürich), Roman Böhringer (OpenCore), Alexandru Calotoiu (ETH Zürich), Torsten Hoefler (ETH Zürich)