COSMA GPU Nodes
Last updated: 21/01/2026
Summary of Available Nodes
GPU |
Count |
Node(s) |
Access |
Partition |
Account |
Interconnect |
Status |
|---|---|---|---|---|---|---|---|
V100 |
6x |
gn001 |
SSH |
- |
do016 |
- |
✓ |
A30 |
8x total |
gc001-008 |
Slurm |
dine2 |
do015 |
CerIO (composable) |
Draining |
A100 |
3x total |
mad04-06 |
Slurm / SSH |
cosma8-shm |
do016 |
Liqid (composable) |
✓ |
GH200 |
2x |
gn002-003 |
Slurm / SSH |
gracehopper |
do016 |
- |
✓ |
H100 NVL |
1x |
gn004 |
SSH |
intelhopper (down) |
do016 |
- |
✓ (SSH only) |
MI100 |
1x |
ga004 |
Slurm |
cosma8-shm2 |
do018 |
- |
✓ |
MI210 |
2x each |
ga005-006 |
Slurm |
cosma8-shm2 |
do018 |
- |
ga006 draining |
MI300X |
8x |
ga007 |
Slurm |
mi300x |
do018 |
- |
✓ |
MI300A |
4x |
ga008 |
SSH |
mi300a (down) |
do018 |
- |
Down |
PVC |
2x |
gi001 |
- |
- |
do017 |
- |
Dead |
GPU Specifications (Theoretical)
GPU |
Architecture |
FP64 TFLOPS |
VRAM (GB) |
Memory BW (GB/s) |
Backend |
|---|---|---|---|---|---|
V100 |
Volta |
7.0 |
32 |
900 |
CUDA |
A30 |
Ampere |
5.2 |
24 |
933 |
CUDA |
A100 |
Ampere |
9.7 |
40 |
1555 |
CUDA |
GH200 |
Hopper |
34 |
96 |
4000 |
CUDA |
H100 PCIe |
Hopper |
26 |
80 |
2000 |
CUDA |
H100 SXM |
Hopper |
34 |
80 |
3350 |
CUDA |
H100 NVL |
Hopper |
34 |
94 |
3938 |
CUDA |
MI100 |
CDNA |
11.5 |
32 |
1200 |
HIP |
MI210 |
CDNA2 |
22.6 |
64 |
1600 |
HIP |
MI300X |
CDNA3 |
81.7 |
192 |
5300 |
HIP |
MI300A |
CDNA3 (APU) |
61.3 |
128 |
5300 |
HIP |
PVC |
Xe HPC |
52.4 |
128 |
3280 |
SYCL |
Detailed Node Information
NVIDIA CUDA Systems
Node |
GPUs |
Host |
Access |
Account |
Arch |
Notes |
|---|---|---|---|---|---|---|
gn001 |
10x V100 (32GB) |
768GB, 2x Xeon Gold 5218 |
SSH |
do016? |
sm_70 |
|
gc001-008 |
0-8x A30 (composable) |
2TB, 64 cores (Sapphire Rapids) |
|
do015 |
sm_80 |
CerIO fabric; 8 GPUs total across 8 nodes |
mad04, mad05 |
0-3x A100 40GB (composable) |
4TB, 128 cores |
|
do016? |
sm_80 |
Liqid fabric; default 1 GPU each |
mad06 |
0-3x A100 (composable) |
1TB, 128 cores (Milan-X, 768MB L3) |
SSH |
do016? |
sm_80 |
|
gn002 |
GH200 |
ARM (aarch64) |
SSH |
do016 |
sm_90 |
ARM host: requires native compilation |
gn003 |
GH200 |
ARM (aarch64) |
|
do016 |
sm_90 |
No system cmake; CUDA 13.0 at |
gn004 |
1x H100 PCIe |
x86 (Intel) |
SSH |
do016? |
sm_90 |
AMD HIP/ROCm Systems
Node |
GPUs |
Host |
Access |
Account |
Arch |
Notes |
|---|---|---|---|---|---|---|
ga004 |
1x MI100 |
1TB, 128 cores (Milan 7713) |
|
do018? |
gfx908 |
|
ga005, ga006 |
2x MI210 each |
1TB, 64 cores (EPYC 7513) |
|
do018? |
gfx90a |
Use |
ga007 |
8x MI300X |
- |
|
do018 |
gfx942 |
|
ga008 |
4x MI300A |
500GB shared |
SSH |
do018 |
gfx942 |
APU: CPU/GPU share physical RAM |
Note: ROCm 6.3.0 installed system level on AMD nodes at /opt/rocm-6.3.0/.
Intel Systems
Node |
GPUs |
Access |
Account |
Status |
|---|---|---|---|---|
gi001 |
2x Ponte Vecchio |
- |
do017 |
Dead |
Note: Likely to stay dead for now!
Compilers
System |
Compiler |
Location |
Notes |
|---|---|---|---|
CUDA (x86, sm_80+) |
nvcc 13.0 |
|
sm_80, sm_90 |
CUDA (GH200) |
nvcc 13.0 |
|
ARM host: |
HIP (AMD nodes) |
hipcc 6.3.0 |
|
System-level: no module needed |
Note: The hipcc/6.0amd, hipcc/6.3amd, hipcc/7.0amd modules only set CC/CXX environment variables for dependent modules (e.g. MPI). ROCm is already available at system level.
Project Codes (from SAFE)
Code |
Systems |
|---|---|
do015 |
DINE2 (A30s) |
do016 |
NVIDIA GPUs (V100, A100, GH200, H100), cosma8-shm |
do017 |
Intel GPUs (dead) |
do018 |
AMD GPUs (MI100, MI210, MI300X, MI300A) |
do009 |
DINE, miscellaneous |
Access/Useful Commands
Check partition and node status
sinfo -a # list all partitions and statuses
sinfo -p cosma8-shm2 -N -l # detailed node info for a partition
squeue -p mi300x # show jobs running on partition
scontrol show partition=<partition_name> | grep AllowAccounts
Check access permissions
id # show groups you belong to
scontrol show partition= | grep AllowAccounts # show accounts that can use partitions
GPU Availability:
# NVDIA nodes
nvidia-smi
# AMD nodes
rocm-smi
rocminfo
Can use in slurm script to check GPU present.
Slurm submission
# by partition
sbatch -p cosma8-shm -A do016 script.sh # A100 on cosma8-shm
# by specific node
sbatch --nodelist=ga004 script.sh # ga004
# interactive session
srun -p mi300x -A do018 -t 10 --pty /bin/bash # mi300x
Direct SSH access
# From a COSMA login node:
ssh gn001 # V100s
ssh gn002 # Grace-Hopper (ARM)
ssh gn004 # H100
ssh ga008 # MI300A
ssh mad06 # A100 + Milan-X