RED-SEA: Network Solution for Exascale Architectures

This paper was accepted for the special session “European Projects in Digital Systems Design (EPDSD)” at the 25th Euromicro Conference on Digital System Design (DSD), held from 31 August to 2 September 2022 in Gran Canaria, Spain. It gives an overview of the RED-SEA project’s goals and approach, and was prepared by INFN in collaboration with all project partners.

Abstract

In order to enable Exascale computing, next generation interconnection networks must scale to hundreds of thousands of nodes, and must provide features to also allow the HPC, HPDA, and AI applications to reach Exascale, while benefiting from new hardware and software trends. RED-SEA will pave the way to the next generation of European Exascale interconnects, including the next generation of BXI, as follows: (i) specify the new architecture using hardware-software co-design and a set of applications representative of the new terrain of converging HPC, HPDA, and AI; (ii) test, evaluate, and/or implement the new architectural features at multiple levels, according to the nature of each of them, ranging from mathematical analysis and modeling, to simulation, or to emulation or implementation on FPGA testbeds; (iii) enable seamless communication within and between resource clusters, and therefore development of a high-performance low latency gateway, bridging seamlessly with Ethernet; (iv) add efficient network resource management, thus improving congestion resiliency, virtualization, adaptive routing, collective operations; (v) open the interconnect to new kinds of applications and hardware, with enhancements for end-to-end network services – from programming models to reliability, security, low-latency, and new processors; (vi) leverage open standards and compatible APIs to develop innovative reusable libraries and Fabrics management solutions.

Authors

Andrea Biagioni (INFN), Paolo Cretaro (INFN), Ottorino Frezza (INFN), Francesca Lo Cicero (INFN), Alessandro Lonardo (INFN), Michele Martinelli (INFN), Pier Stanislao Paolucci (INFN), Elena Pastorelli (INFN), Francesco Simula (INFN), Matteo Turisini (INFN), Piero Vicini (INFN), Roberto Ammendola (INFN), Pascale Bernier-Bruna (Atos), Claire Chen (Atos), Said Derradji (Atos), Stephane Guez (Atos), Pierre-Axel Lagadec (Atos), Gregoire Pichon (Atos), Etienne Walter (Atos), Gaetan De Gassowski (CEA), Matthieu Hautreaux (CEA), Stephane Mathieu (CEA), Gilles Moreau (CEA), Marc Perache (CEA), Hugo Taboada (CEA), Torsten Hoefler (ETH Zürich), Timo Schneider (ETH Zürich), Matteo Barnaba (Exact Lab), Giuseppe Piero Brandino (Exact Lab), Francesco De Giorgi (Exact Lab), Matteo Poggi (Exact Lab), Iakovos Mavroidis (Exapsys), Yannis Papaefstathiou (Exapsys), Nikolaos Tampouratzis (Exapsys), Benjamin Kalisch (Extoll), Ulrich Krackhardt (Extoll), Mondrian Nuessle (Extoll), Pantelis Xirouchakis (Forth), Vangelis Mageiropoulos (Forth), Michalis Gianioudis (Forth), Harisis Loukas (Forth), Aggelos Ioannou (Forth), Nikos Kallimanis (Forth), Nikos Chrysos (Forth), Manolis Katevenis (Forth), Wolfgang Frings (Julich Research Centre), Dominik Gottwald (Julich Research Centre), Felime Guimaraes (Julich Research Centre), Max Holicki (Julich Research Centre), Volker Marx (Julich Research Centre), Yannik Muller (Julich Research Centre), Carsten Clauss (ParTec), Hugo Falter (ParTec), Xu Huang (ParTec), Jennifer Lopez Barillao (ParTec), Thomas Moschny (ParTec), Simon Pickartz (ParTec), Francisco J. Alfaro (University of Castilla-La Mancha), Jesus Escudero-Sahuquillo (University of Castilla-La Mancha), Pedro Javier Garcia (University of Castilla-La Mancha), Francisco J. Quiles (University of Castilla-La Mancha), Jose L. Sanchez (University of Castilla-La Mancha), Adrián Castelló (Universidad Politecnica de Valencia), Jose Duro (Universidad Politecnica de Valencia), Maria Engracia Gomez (Universidad Politecnica de Valencia), Enrique Quintana (Universidad Politecnica de Valencia), Julio Sahuquillo (Universidad Politecnica de Valencia), Eugenio Stabile (Universidad Politecnica de Valencia).

DOI: 10.1109/DSD57027.2022.00100
