
Clariden is a vcluster (versatile software-defined cluster) that's part of the Alps system.

A short summary of the hardware available on the nodes:

Partition   Nodes   GPUs per node   GPU           GPU memory   Max time
nvgpu       18      4               Nvidia A100   80GB         1 day
amdgpu      12      8               AMD Mi200     64GB         1 day
normal      200     0               -             -            1 day
clariden    200     0               -             -            30 minutes

More information on the partitions can be found with `scontrol show partitions`.
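
For example, to inspect the limits and node list of a single partition (here nvgpu, taken from the table above):

    scontrol show partition nvgpu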

Access and Accounting

  • Clariden can be reached via ssh from the front end Ela (`ssh <username>@ela.cscs.ch`); see the example after this list. Access to CSCS services and systems requires users to authenticate using multi-factor authentication (MFA). Please find more information here.
  • Account and Resources Management Tool
  • Usage policies
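
A minimal login sequence looks as follows; this is a sketch assuming the usual CSCS two-hop setup, and the internal hostname clariden used from Ela is an assumption (check the login message for the exact name):

    # First hop: the publicly reachable front end Ela (MFA required)
    ssh <username>@ela.cscs.ch

    # Second hop, from Ela to the vcluster (hostname assumed)
    ssh clariden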

Running Jobs

  • Clariden uses the Slurm workload manager for job scheduling.
  • Some typical/helpful Slurm commands are:

    sbatch
    submit a batch script
    squeue
    check the status of jobs on the system
    scancel
    delete one of your jobs from the queue
    srun
    launch commands in an existing allocation
    srun --interactive --jobid <jobid> --pty bash
    start interactive session on an allocated node
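
    For example, to monitor or cancel your own jobs (standard Slurm flags):

    # List only your own jobs
    squeue -u $USER

    # Cancel a job by its ID
    scancel <jobid>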
  • Example of Slurm job script

    #!/bin/bash -l
    
    #SBATCH --job-name=<jobname>
    #SBATCH --time=00:15:00
    #SBATCH --nodes=1
    #SBATCH --ntasks-per-core=1
    #SBATCH --ntasks-per-node=4
    #SBATCH --cpus-per-task=16
    #SBATCH --partition=nvgpu
    #SBATCH --account=<project>
    
    srun executable
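
    To use this script, save it to a file (e.g. job.sh, a placeholder name), replace the <jobname> and <project> placeholders, and submit it:

    sbatch job.sh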
  • Currently there is no accounting of compute time, but it is expected to be set up.
  • Please note that the Slurm scheduling system is a shared resource that can handle only a limited number of batch jobs and interactive commands simultaneously. Therefore, users should not submit arbitrarily large numbers of Slurm jobs and commands at the same time.

File system

Software
