AMD GPU testbeds
COSMA contains multiple AMD GPU systems to allow code development, porting and performance benchmarking.
Specifications
Usage
If you do not already have an account on COSMA, please follow the account registration instructions, then request to join the project do018.
The MI100 node is accessible through the cosma8-shm2 queue. Use --nodelist=ga004 to ensure allocation to the MI100 node.
The MI210 nodes are accessible through the cosma8-shm2 queue. Use --exclude=ga004 to ensure allocation to the MI210 nodes.
The MI300X node is accessible through the mi300x queue.
The MI300A node is accessible through direct ssh. From a login node, use ssh ga008.
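The queue and node options above can be collected in a small helper. This is a minimal sketch, assuming the queues are Slurm partitions selected with --partition and that the project is passed with --account. The function name gpu_sbatch_args and the script name job.sh are hypothetical and not part of COSMA.

```shell
#!/bin/bash
# Hypothetical helper: print the sbatch options for a given AMD GPU type.
# Assumption: the "queues" named in this page are Slurm partitions.
gpu_sbatch_args() {
    case "$1" in
        mi100)  echo "--partition=cosma8-shm2 --nodelist=ga004" ;;
        mi210)  echo "--partition=cosma8-shm2 --exclude=ga004" ;;
        mi300x) echo "--partition=mi300x" ;;
        mi300a) echo "MI300A has no queue: run 'ssh ga008' from a login node" >&2
                return 1 ;;
        *)      echo "unknown GPU type: $1" >&2
                return 1 ;;
    esac
}

# Print (rather than run) the command that would submit job.sh,
# a hypothetical batch script, to one of the MI210 nodes.
echo "sbatch --account=do018 $(gpu_sbatch_args mi210) job.sh"
```

Printing the command instead of running it makes it easy to check the options before submitting. The same options can also be written as #SBATCH lines at the top of a batch script.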
Known issues / notes
The AMD ROCm software stack is installed. ROCm 6.3.0 is installed under /opt/rocm-6.3.0, and its HIP compiler is available at /opt/rocm-6.3.0/bin/hipcc
CUDA code must be converted to HIP before it can be compiled for these GPUs, using the hipify tools provided with ROCm (hipify-perl or hipify-clang).
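A conversion and build might look like the sketch below. It assumes that hipify-perl sits alongside hipcc in /opt/rocm-6.3.0/bin and that a CUDA source file called saxpy.cu exists in the current directory (both are assumptions for illustration). The helper gpu_arch is hypothetical. The architecture names it returns are the standard LLVM targets for these GPUs: gfx908 for MI100, gfx90a for MI210 and gfx942 for MI300X and MI300A.

```shell
#!/bin/bash
# Put the ROCm 6.3.0 tools on PATH (install prefix taken from the note above).
ROCM_PATH=/opt/rocm-6.3.0
export PATH="$ROCM_PATH/bin:$PATH"

# Hypothetical helper: map a GPU model to its --offload-arch target.
gpu_arch() {
    case "$1" in
        mi100)         echo gfx908 ;;
        mi210)         echo gfx90a ;;
        mi300x|mi300a) echo gfx942 ;;
        *)             echo "unknown GPU type: $1" >&2
                       return 1 ;;
    esac
}

# Convert and build only if the tools and the example source are present.
# saxpy.cu is a hypothetical CUDA file used purely for illustration.
if command -v hipify-perl >/dev/null 2>&1 \
   && command -v hipcc >/dev/null 2>&1 \
   && [ -f saxpy.cu ]; then
    # hipify-perl writes the translated source to standard output.
    hipify-perl saxpy.cu > saxpy.hip
    # Build for the MI210 nodes; change the argument for other GPUs.
    hipcc --offload-arch="$(gpu_arch mi210)" saxpy.hip -o saxpy
else
    echo "hipify-perl, hipcc or saxpy.cu not found; skipping build"
fi
```

hipify-perl does a textual translation, so the generated file is worth reviewing by hand. A binary built for one architecture will not run on the others unless several --offload-arch options are passed.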