...

  • We aim to keep planned disruptive updates (meaning the services may potentially be inaccessible) to Tuesday Wednesday morning (0800 - 1200 CET).
  • Exceptional and non-disruptive updates may happen outside of this period.

Access and Accounting

  • Clariden can be reached via SSH from the frontend Ela (`ssh <username>@ela.cscs.ch`). Access to CSCS services and systems requires users to authenticate using multi-factor authentication (MFA). Please find more information here.

  • Access to Clariden is managed through Waldur (https://portal.cscs.ch/). For SwissAI, your access to Clariden is managed by your respective vertical/horizontal project administrators and project managers (typically your PI).

  • Usage policies

Connecting to Clariden

...

  • Clariden uses the Slurm workload manager for job scheduling.
  • Some helpful Slurm commands are:

    sbatch
        submit a batch script
    squeue
        check the status of jobs on the system
    scancel
        delete one of your jobs from the queue
    srun
        launch commands in an existing allocation
    srun --interactive --jobid <jobid> --pty bash
        start an interactive session on an allocated node
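
The commands above are often combined: submit a script, keep its job ID, then query or cancel that job. The following is a minimal sketch, not an official CSCS recipe. The helper names `parse_jobid` and `submit_job` and the script name `myscript.sh` are hypothetical; `sbatch --parsable` is a standard Slurm option that prints the job ID, optionally followed by `;cluster`.

```shell
#!/bin/bash
# Sketch: submit a batch script and keep its job ID for later squeue/scancel calls.
# parse_jobid and submit_job are illustrative helper names, not part of Slurm.

# Extract the job ID from `sbatch --parsable` output,
# which has the form "jobid" or "jobid;cluster".
parse_jobid() {
    printf '%s\n' "${1%%;*}"
}

# Submit a script (all arguments are passed to sbatch) and print only the job ID.
submit_job() {
    local out
    out=$(sbatch --parsable "$@") || return 1
    parse_jobid "$out"
}

# Usage on a login node (myscript.sh is a placeholder):
#   jobid=$(submit_job myscript.sh)
#   squeue --jobs "$jobid"    # check the status of this job
#   scancel "$jobid"          # delete it from the queue if needed
```

Capturing the job ID at submission time avoids having to search the `squeue` output for it afterwards.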


  • Example of Slurm job script

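    #!/bin/bash -l
    
    # Job name and wall-time limit (HH:MM:SS)
    #SBATCH --job-name=<jobname>
    #SBATCH --time=00:15:00
    # One node, 4 tasks per node, 16 CPUs per task, at most one task per core
    #SBATCH --nodes=1
    #SBATCH --ntasks-per-core=1
    #SBATCH --ntasks-per-node=4
    #SBATCH --cpus-per-task=16
    # Project account to charge
    #SBATCH --account=a-a**
    
    # Launch the executable on the allocated resources
    srun executable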


  • You must specify the account attached to the project you are in. You can do this either in the submission script as above, or on the command line, for example: `sbatch -A a-a11 myscript.sh`
    • Your project account can be identified through Waldur. Simply go to https://portal.cscs.ch/ → Resources → HPC → find the vertical/horizontal you are in, and click on it to see more detailed information.
  • Please note that the Slurm scheduling system is a shared resource that can handle only a limited number of batch jobs and interactive commands simultaneously. Users should therefore avoid submitting large numbers of Slurm jobs and commands at the same time.
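
One way to follow the guidance above when submitting many jobs from a script is to check how many of your jobs are already queued before submitting more. The sketch below is only an illustration under stated assumptions: `MAX_JOBS`, `count_my_jobs` and `can_submit` are hypothetical names, and the limit of 20 is an arbitrary self-imposed value, not a CSCS policy. It relies only on the standard `squeue -h -u <user>` options (no header, filter by user).

```shell
#!/bin/bash
# Sketch: throttle submissions by checking your own queue depth first.
# MAX_JOBS is a hypothetical, self-imposed limit, not a CSCS policy value.
MAX_JOBS=${MAX_JOBS:-20}

# Count this user's pending and running jobs (-h suppresses the header line).
count_my_jobs() {
    echo $(( $(squeue -h -u "${USER:-$(id -un)}" | wc -l) ))
}

# Succeed only while the number of queued jobs is below MAX_JOBS.
can_submit() {
    [ "$(count_my_jobs)" -lt "$MAX_JOBS" ]
}

# Usage (account, script and input names are placeholders):
#   for input in inputs/*; do
#       until can_submit; do sleep 60; done
#       sbatch -A <account> myscript.sh "$input"
#   done
```

The polling interval is kept long on purpose, since frequent `squeue` calls also add load to the scheduler.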

...

Support

Your first port of call for support should be to check for related topics in the #cscs-users channel in the SwissAI Slack space (swissai-initiative.slack.com). We additionally provide a more general Slack space (cscs-users.slack.com) where CSCS engineers are also present. While CSCS engineers are very helpful in this Slack space, it is not an official support channel. If you can't resolve your issue through the above means, the recommended way to get support is to create a ticket on our helpdesk (https://jira.cscs.ch/plugins/servlet/desk). We endeavor to respond to your tickets within about 3 hours.