Welcome to the COSMA support documentation!
Welcome to the COSmology MAchine (COSMA) support pages
For general information about COSMA, please see the COSMA section, which provides an overview of the COSMA HPC system.
Check out the Getting an account section to get started!
Note
COSMA is operating normally
Note
Downtimes: COSMA has three periods of scheduled downtime per year, each lasting up to a week, though the affected period is typically shorter. These fall during the first full weeks of February, June and October (i.e. every four months). The currently scheduled periods are:
- 7-11th October 2024
- 3-7th February 2025
- 2-6th June 2025
Note
These pages are under active development. If you see any out-of-date or incorrect information, please let us know!
Contents
- Contact Us
- Rocky9 Cluster upgrade
- Frequently Asked Questions
- How do I best formulate my questions to the support staff?
- Who should I consult for technical problems? Which mailing list should I use?
- Who can get an account on cosma?
- What should I do to get an account?
- Can my external collaborator get an account too?
- How do I log in to COSMA?
- Can I use VNC? Where should I run the server?
- What are the main differences between cosma5, cosma7 and cosma8?
- Why is there a cosma5, 7 and 8, but no cosma6?
- How do I decide which COSMA to use?
- How do I copy (large) files to/from cosma?
- Where is my home directory?
- Are there disk usage quotas?
- Can I print from cosma?
- What are these “modules”?
- Which modules should I load? Are these different on the different COSMA systems? How do I find which modules are available?
- What are the recommended modules for running Gadget? Arepo?
- How do I adapt the Makefile to be consistent with these modules?
- How do I load modules automatically on login?
- I want to perform a large simulation: Who do I ask? Can I just run?
- How do I write the submission script?
- What are the main commands to interact with a batch job?
- How do I make sure my job runs on full nodes?
- How many cores can I reasonably request?
- What determines when my job starts? How do I know the job is running? How can I kill it?
- Where should I put the data? Is that backed-up?
- Do I need to worry about where the initial conditions are stored?
- How do I run an embarrassingly parallel job?
- Can I log-on to a compute node? Should I?
- How do I find out how busy the system is?
- What is “back-fill”? How do I exploit this?
- I want to analyse a simulation, but need a lot of memory: how do I do this?
- I want to visualise my data: what can I use?
- I am developing a code and need to profile and debug it
- What are the profiling tools?
- My short tests take a long time to start: can I do something about this?
- Can I run interactively on a node? How do I do this?
- Tell me about parallel compilation. Can I compile on cosma6 and then run on cosma7?
- Locale errors
- Does mounting COSMA via sshfs on my local machine hog any COSMA resources?
- How do I find out the disk quota I have? How can I get more?
- My jobs die without creating an output
- I am in many groups, how do I change my default one?
- Using the snap file systems (/snap7, /snap8)
- I am a PI - how do I use my time?
- As a PI, how do I manage storage allocations?
- Usage of the Rockport Network System
- VIRGO Disk Usage
- Code-specific information
- Intel MPI and Lustre
- Nbodykit
- COSMA presentations
- Archival and retrieval
- The module environment
- Spack Package Environment
- GitLab
- Using Python on COSMA
- Jupyter Hub
- MPI Hints
- The Lustre file system
- Quotas
- File system access control lists
- Graphical Access
- Slurm
- X11 forwarding for debugging
- Direct compute node access
- Reservations
- Example scripts
- Large job over many nodes
- Many similar jobs simultaneously
- Many single core jobs on a single node
- Many nodes with 2 jobs per node (which can then make use of the additional cores using threading/OpenMP)
- Using the cordelia queue to submit single core jobs without taking up a whole node
- Using non-uniform resources per node in a single job (heterogeneous jobs)
- Using DDT without direct compute node access
- SLURM hints and tips
- COSMA job queues
- System Usage
- GPUs
- Data Transfer