The cluster is specialised to run Machine Learning/Artificial Intelligence and HTC jobs.
The Lyra HPC cluster has an architecture designed to run Machine Learning/Artificial Intelligence and High Throughput Computing (HTC) workloads. The cluster is physically composed of 3 management nodes and 20 compute nodes deployed as a hyperconverged infrastructure.
Virtual compute nodes are deployed on the hyperconverged infrastructure and interconnected by a virtual VXLAN-EVPN network running over 2x 50 GbE redundant links. Storage relies on Ceph, which provides the storage spaces mounted on the virtual compute nodes. After some parameter tuning, the virtual HPC cluster runs synthetic benchmarks close to the theoretical peak performance at the CPU, GPU and memory levels.
Each virtual compute node features the following specifications:
In total, researchers will benefit from 1280 CPU cores, 727040 CUDA cores and 5120 GB RAM. The user home quota is set at 100 GB and the user global scratch quota is set at 5 TB.
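As a rough illustration of how these totals break down, the sketch below divides them evenly across the compute nodes. The node count of 20 is an assumption (one virtual compute node per physical compute node, all identical), not a figure stated here, and the names `ASSUMED_NODE_COUNT` and `per_node` are hypothetical.

```python
# Totals quoted in the text.
TOTAL_CPU_CORES = 1280
TOTAL_CUDA_CORES = 727040
TOTAL_RAM_GB = 5120

# Assumption (not stated in the text): the totals are spread evenly over
# 20 identical virtual compute nodes, one per physical compute node.
ASSUMED_NODE_COUNT = 20


def per_node(total: int, nodes: int = ASSUMED_NODE_COUNT) -> int:
    """Return the per-node share, failing loudly if the split is uneven."""
    share, remainder = divmod(total, nodes)
    if remainder:
        raise ValueError(f"{total} does not divide evenly across {nodes} nodes")
    return share


averages = {
    "cpu_cores": per_node(TOTAL_CPU_CORES),
    "cuda_cores": per_node(TOTAL_CUDA_CORES),
    "ram_gb": per_node(TOTAL_RAM_GB),
}
print(averages)
```

Under that assumption, each node would average 64 CPU cores, 36352 CUDA cores and 256 GB of RAM. If the virtual node layout differs, only `ASSUMED_NODE_COUNT` needs to change.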
In addition, the Lyra cluster is managed entirely through an Infrastructure-as-Code process. It can therefore be redeployed from scratch quickly and adapted with little effort, for example to support multiple VM types for more heterogeneous workloads or to offer temporary specialised working environments.
The cluster was funded by the Walloon Region (convention n° 1910167) with a budget of 700 K€.
The Lyra cluster is hosted in the new ULB A6K datacentre. The racks are fed by two distinct power sources, each protected by a UPS against short power cuts and, soon, by a power generator against long ones, which ensures a continuous power supply. The highly efficient cooling system keeps the cooling temperature constant regardless of the workload on the cluster. Regarding security, protections have been deployed at both the physical and logical levels.