DINE

The Durham Interconnected Novel Environment (DINE) is a test production cluster for evaluating different HPC, networking and storage technologies. It is now in its second generation.

To date, it has hosted BlueField-1, Rockport and BlueField-2 technologies.

Installed in 2019, DINE was the first AMD Rome-based production cluster in the UK.

Specifications

The DINE cluster consists of 24 nodes, each equipped with a 200G BlueField-2 Data Processing Unit from NVIDIA. DPUs offload networking and storage workloads from the main CPU of the node, separating infrastructure processing from job processing.

Nodes   RAM per node   CPU(s)                     Network Technology
24      512 GB         2x AMD EPYC 7302 16-Core   InfiniBand

Usage

If you do not already have an account on COSMA, please follow the instructions here, then request to join project do009.

The DINE cluster is part of the bluefield1 SLURM partition. Jobs should be submitted to the queue using #SBATCH -p bluefield1.

There is no direct ssh access to DINE nodes.
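Because the nodes cannot be reached over ssh, work is run through batch jobs. A minimal sketch of a submission script follows. Only the partition name comes from this page. The account flag (assumed to match project do009), node count, time limit and job name are illustrative assumptions, and the task count simply reflects the 2x 16-core CPUs listed in the specifications.

```shell
#!/bin/bash
#SBATCH -p bluefield1          # DINE partition, as stated above
#SBATCH -A do009               # assumption: account name matches the project
#SBATCH --nodes=2              # hypothetical: 2 of the 24 nodes
#SBATCH --ntasks-per-node=32   # 2 x 16 cores per node
#SBATCH --time=00:10:00        # hypothetical wall-clock limit
#SBATCH -J dine-test           # hypothetical job name

# SLURM sets SLURM_JOB_NODELIST inside a job; fall back to "unknown" elsewhere.
NODELIST="${SLURM_JOB_NODELIST:-unknown}"
echo "Allocated nodes: ${NODELIST}"

# Print the name of every node in the allocation. Outside SLURM (no srun),
# just print the local node name so the script can still be tried out.
if command -v srun >/dev/null 2>&1; then
    srun uname -n
else
    uname -n
fi
```

Saved as, say, dine-test.sh (a hypothetical file name), it would be submitted from a COSMA login node with `sbatch dine-test.sh`.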

Known issues / notes

The bluefield nodes are automatically powered down when not in use, so after you submit a job its status may show as CF (configuring) while the allocated nodes boot.
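To see whether a job is still waiting for its nodes to boot, its compact state code can be read from `squeue`. The helper below is a small sketch, not part of COSMA's tooling: the function names and the job ID in the example are hypothetical, and only a few common state codes are mapped.

```shell
#!/bin/bash
# Translate a compact SLURM job state code into a short explanation.
describe_state() {
    case "$1" in
        PD) echo "pending: waiting in the queue" ;;
        CF) echo "configuring: allocated nodes are booting" ;;
        R)  echo "running" ;;
        CG) echo "completing" ;;
        *)  echo "other state: $1" ;;
    esac
}

# Print the compact state of one job (-h: no header, -o %t: state code only).
job_state() {
    squeue -h -j "$1" -o "%t"
}

# Example with a hypothetical job ID:
#   describe_state "$(job_state 123456)"
```

A job that reports CF needs no action; it should move to R once the nodes have finished booting.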