...

Partition | Nodes | GPUs per node | GPU         | GPU memory | Max time
nvgpu     | 18    | 4             | Nvidia A100 | 80GB       | 1 day
amdgpu    | -     | 8             | AMD Mi200   | 64GB       | 1 day
normal    | 1300  | 4             | GH200       | 96GB       | 12 hours
debug     | -     | 4             | GH200       | 96GB       | 30 minutes

More information on the partitions can be found with scontrol show partitions.
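
For instance, the full settings of a single partition (using the names in the table above) can be printed with:

    scontrol show partition nvgpu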

...

It's possible to create an SSH tunnel to Clariden via Ela. The following must be added to your personal computer's ~/.ssh/config file (replacing <username> with your CSCS username):
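
The entry itself falls in the elided part of this excerpt; as a minimal sketch, assuming ela.cscs.ch as the public jump host and clariden.cscs.ch as the machine's internal hostname (the latter name is an assumption), it would look like:

    Host ela
        HostName ela.cscs.ch
        User <username>

    Host clariden
        HostName clariden.cscs.ch
        User <username>
        ProxyJump ela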

...

  • Clariden uses the Slurm workload manager for job scheduling.
  • Some typical/helpful Slurm commands are listed below, with a short usage sketch after the list:

    sbatch
    submit a batch script
    squeue
    check the status of jobs on the system
    scancel
    delete one of your jobs from the queue
    srun
    launch commands in an existing allocation
    srun --interactive --jobid <jobid> --pty bash
    start an interactive session on an allocated node
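
    A quick usage sketch of these commands (the script name submit.sh is a placeholder):

    sbatch submit.sh                                # prints "Submitted batch job <jobid>"
    squeue -u $USER                                 # list only your own jobs
    scancel <jobid>                                 # remove one of your jobs from the queue
    srun --interactive --jobid <jobid> --pty bash   # open a shell on a node of the running job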


  • Example of a Slurm job script

    #!/bin/bash -l
    
    #SBATCH --job-name=<jobname>
    #SBATCH --time=00:15:00            # requested walltime (HH:MM:SS)
    #SBATCH --nodes=1
    #SBATCH --ntasks-per-core=1
    #SBATCH --ntasks-per-node=4        # e.g. one task per GPU on a 4-GPU node
    #SBATCH --cpus-per-task=16
    #SBATCH --partition=nvgpu
    #SBATCH --account=<project>
    
    srun executable

    Currently there's no accounting of the compute time, but it's expected to be set up.
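
    Assuming the script above is saved as submit.sh (the name is a placeholder), it is submitted with "sbatch submit.sh"; the job ID printed by sbatch can then be used with the srun --interactive command above.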


  • Please note that the Slurm scheduling system is a shared resource that can handle only a limited number of batch jobs and interactive commands simultaneously. Users should therefore not submit large numbers of Slurm jobs and commands at the same time.
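
  • One standard Slurm feature that helps with this is a job array, which bundles many similar runs into a single submission; a minimal sketch, reusing the placeholders from the script above:

    #!/bin/bash -l
    
    #SBATCH --job-name=<jobname>
    #SBATCH --array=0-9                # ten array tasks managed as one submission
    #SBATCH --time=00:15:00
    #SBATCH --nodes=1
    #SBATCH --partition=nvgpu
    #SBATCH --account=<project>
    
    # each array task receives its own index in SLURM_ARRAY_TASK_ID
    srun executable ${SLURM_ARRAY_TASK_ID}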

...