COSMA News

22/11/24: dataweb with Globus available for general use

8/11/24: AMD MI300X system available for general use

25/10/24: COSMA Gitlab operational after upgrade

8/10/24: COSMA maintenance period ends (2 days)

6/9/24: COSMA AI assistant is back

4/9/24: AMD GPUs available in the cosma8-shm2 queue (MI100 and MI200)

20/8/24: DINE (bluefield1) partition now partly available

9/8/24: The old cosma queue (COSMA5 nodes) returns to service, along with login5 (login5a, login5b)

30/7/24: Homespace snapshotting enabled

12/7/24: COSMA reinstalled with Rocky9 and OpenStack

1/7/24: COSMA maintenance period

5/6/24: Intel Sapphire Rapids node with 2x Ponte Vecchio GPUs available for use

5/6/24: NVIDIA Grace Hopper nodes available for use (2x)

5/6/24: cosma8-dine2 nodes now with composable GPUs (up to 8 NVIDIA A30s)

1/5/24: cosma8-dine2 nodes made available for test access

16/4/24: New AMD Bergamo nodes introduced for COSMA5, plus new login5c

5/4/24: COSMA6 reused in Argentina

4/4/24: New VAST storage system available on DINE as /dine

3/4/24: Default Python module version changed to 3.6.5

3/4/24: COSMA back after unexpected Building Management System outage

2/4/24: We may start using news again!

28/3/24: VAST storage system installed

27/3/24: New Cerio composable system cabled up

27/2/23: This news feed is now no longer used!

9/2/22: COSMA back in production

7/2/22: COSMA downtime starts

18/12/21: MPPHEA paper

2/12/21: /cosma8 back in production

29/11/21: /cosma8 storage issues

6/10/21: COSMA uptime

4/10/21: COSMA downtime

1/10/21: COSMA8 commissioning period officially ends

1/7/21: COSMA8 commissioning period has begun

20/6/21: COSMA8 HPL run gives 1.37PF (at 280kW)

20/4/21: AMD Milan node (with 128 cores, 1TB RAM and MI100 GPU) goes live

10/4/21: COSMA8 power-on

1/4/21: COSMA8 integration starts

1/3/21: COSMA becomes a Dell Centre of Excellence

4/2/21: COSMA back from downtime

28/1/21: COSMA network link severed between Durham and Leeds

26/1/21: Updated OneAPI module installed

16/12/20: DiRAC-3 system ordered

5/10/20: COSMA downtime starts

21/9/20: COSMA8 Compute nodes powered up, with novel on-chip cooling

7/8/20: COSMA8 service nodes brought into production

3/6/20: COSMA downtime completed

15/5/20: COSMA seems to have survived (so far) the world-wide HPC attacks

14/5/20: All users must regenerate SSH keys and upload to SAFE

17/4/20: GCC 9.3 and Intel 2020 (update 1) compilers now available for use

1/4/20: BlueField cluster ready for users (first 4 nodes)

16/3/20: x2go installed on login nodes to aid remote working during COVID-19

2/3/20: 16-node BlueField delivered and racked (awaiting power cables).

28/2/20: New database server for virgodb delivered.

5/2/20: Permanent host for V100 GPU cards identified

5/2/20: New COSMA5 storage online - from nearly 30kW down to 1.5kW

5/2/20: COSMA is alive again!

3/2/20: COSMA in downtime… back soon

25/11/19: New COSMA6 storage in service across all of COSMA

19/11/19: New COSMA6 storage in service on COSMA6 nodes

11/11/19: New COSMA6 storage migration ongoing

12/10/19: COSMA back into production

7/10/19: COSMA downtime has started

24/9/19: V100 GPUs are ready for use

23/9/19: Advance warning: COSMA downtime starting 7th October

20/9/19: Two NVIDIA V100 GPUs (32GB) have arrived and will be installed shortly.

6/9/19: Lydia Heck’s retirement day today! She departs on the 11th.

30/8/19: New COSMA5 storage has arrived. This will be installed and put into production shortly

22/7/19: COSMA awarded funding to replace old /cosma5 storage

15/7/19: COSMA4 /gpfs file system powered down for good!

12/7/19: At risk period, 16th July (UPS upgrade)

3/7/19: COSMA login nodes have a fast 10GBit/s link to the outside world

20/6/19: Old data transfer nodes retired from service

7/6/19: New data transfer node, dataweb, added with 20GBit/s connectivity, funded by a CO2 reduction scheme

7/6/19: New login node added, login7c

6/6/19: COSMA back in production

3/6/19: COSMA Downtime

2/5/19: Sheffield GPU Hackathon in August: http://gpuhack.shef.ac.uk/

2/5/19: COSMA5 queues reopened after hardware failure a few hours earlier

22/3/19: 152 new COSMA nodes added to COSMA7 queues.

20/3/19: COSMA7 storage outage due to failed PDU. Now operational without data loss.

4/3/19: 152 new COSMA nodes functional and undergoing tests

12/2/19: COSMA downtime from 18th-22nd Feb for installation of new hardware

1/2/19: COSMA awarded funding to replace ageing COSMA6 storage

4/1/19: COSMA7 now expanded to 300 nodes, 230TB RAM.

2/1/19: COSMA welcomes users to a new year!

21/12/18: COSMA5 storage back to full redundancy in time for Christmas

19/12/18: New COSMA7 nodes in test queue, undergoing testing

18/12/18: COSMA5 queue back up after storage hardware failure: no data lost (18 HDDs replaced)

11/12/18: COSMA7 expanding to 300 nodes shortly

10/12/18: Carbon fund application successful: 5 servers will be replaced with one new one

5/12/18: COSMA selected for testing proof of concept systems

19/11/18: COSMA7, part 2, the DiRAC 2.5y Memory Intensive system is due for final installation this week.