The Alps computing system features the production partition Eiger, which is accessible via ssh as eiger.cscs.ch from the front end ela.cscs.ch. The software environment on the system is controlled using the Lmod framework, which provides a flexible mechanism to access compilers, tools and applications.

No modules are loaded by default at login: users first need to load the cray module, after which the modules available in the default Cray programming environment can be loaded. Users are therefore invited to add the command module load cray to their scripts and workflows.
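As a minimal sketch, a typical sequence right after login would be:

module load cray    # enable the default Cray programming environment
module avail        # list the modules that are now available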

Getting Started

Supported applications and libraries are built using the EasyBuild toolchains cpeAMD, cpeCray, cpeGNU and cpeIntel, which load the compilers and libraries of the modules PrgEnv-aocc, PrgEnv-cray, PrgEnv-gnu and PrgEnv-intel respectively. The toolchain modules are available to be loaded upon login: only after loading the selected toolchain can you list (with module avail) and load (with module load) the additional applications and libraries built with that toolchain. Alps users can find more information in the Alps (Eiger) User Guide.
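For example, to use software built with the GNU-based toolchain (the application name below is a placeholder, and module versions may differ on the system):

module load cray            # make the Cray programming environment available
module load cpeGNU          # loads the compilers and libraries of PrgEnv-gnu
module avail                # lists applications and libraries built with cpeGNU
module load <application>   # load one of the listed packages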

Parallel programs compiled with Cray-MPICH (the MPI library available on this system) must be run on the compute nodes using the Slurm srun command: running applications on the login nodes is not allowed, as they are a shared resource. Slurm batch scripts should be submitted with the sbatch command from the $SCRATCH folder: users are not supposed to run jobs from other file systems, because of their lower performance.
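For example, assuming the batch script shown below is saved as job.sh in your scratch folder (the file name is purely illustrative):

cd $SCRATCH
sbatch job.sh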

A simple Slurm job submission script would look like the following:

#!/bin/bash -l
  
#SBATCH --job-name=template
#SBATCH --time=01:00:00
#SBATCH --nodes=6
#SBATCH --ntasks-per-node=16
#SBATCH --cpus-per-task=8
#SBATCH --account=<project>
#SBATCH --constraint=mc

export FI_CXI_RX_MATCH_MODE=hybrid
export MPICH_OFI_STARTUP_CONNECT=1
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK} 
export OMP_PLACES=cores
export OMP_PROC_BIND=close
export OMP_STACKSIZE=8M
export SRUN_CPUS_PER_TASK=${SLURM_CPUS_PER_TASK}
srun --cpu-bind=verbose,cores <executable>

The flag -l at the beginning allows you to call the module command within the script. The srun option --cpu-bind binds MPI tasks to CPUs: in the template above, the keyword cores binds tasks to cores (note that if the number of tasks differs from the number of allocated cores, this can result in sub-optimal binding). The keyword verbose enables verbose mode, which reports the selected CPU binding in the Slurm output for all commands executed in the script; you can also set the SLURM_CPU_BIND environment variable to verbose to obtain the same output. Alternatively, you can use the srun option --hint=nomultithread to avoid extra threads with in-core multi-threading, a configuration that can benefit communication-intensive applications: in this case, please remove --cpu-bind (see man srun for details and the sketch after the notes below). Please note as well:

  • the default OMP_STACKSIZE is small for the GNU compiler, therefore you may get a segmentation fault in multithreaded simulations: in this case, try increasing it as in the template above. The actual value of OMP_STACKSIZE at runtime will be limited by the free memory on the node, therefore you might get an error like libgomp: Thread creation failed: Resource temporarily unavailable if you request more memory than is currently available
  • some applications might fail at runtime reporting an error related to FI_CXI_RX_MATCH_MODE. In this case, please add export FI_CXI_RX_MATCH_MODE=hybrid as in the template above, or export FI_CXI_RX_MATCH_MODE=software, to your Slurm batch script. Other environment variables might be fine-tuned, for instance FI_CXI_RDZV_THRESHOLD, FI_CXI_REQ_BUF_SIZE, FI_CXI_REQ_BUF_MIN_POSTED and FI_CXI_REQ_BUF_MAX_CACHED
  • the number of CPUs per task specified for salloc or sbatch is not automatically inherited by srun and, if desired, must be requested again, either by specifying --cpus-per-task when calling srun, or by setting the SRUN_CPUS_PER_TASK environment variable
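As referenced above, a minimal sketch of the alternative srun invocation with --hint=nomultithread replacing --cpu-bind (the executable name is a placeholder) would be:

# avoid in-core multithreading instead of binding tasks explicitly
srun --hint=nomultithread <executable>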

Please remember to specify the active project that this job should be charged to. This can be done with the Slurm option #SBATCH --account=<project> in the submission script or as a flag with the srun command, i.e. --account=<project> or -A <project>, where the string <project> is the ID of the active project. You also need to specify the Slurm constraint #SBATCH --constraint=mc in the batch script or as an srun option (--constraint=mc or -C mc).
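For instance, both options can also be passed on the command line at submission time (the script name is illustrative):

sbatch --account=<project> --constraint=mc job.sh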

Slurm supports the --mem option to specify the real memory required per node. For applications requiring more than 256 GB of memory per node, users should add the Slurm directive #SBATCH --mem=260G in their jobscript. See the sbatch man page for more details.

Slurm batch queues 

Name of the queue   Max time   Max nodes   Brief Description
debug               30 min     10          Quick turnaround for test jobs (one per user)
normal              24 h       480         Standard queue for production work
prepost             30 min     1           High priority pre/post processing
low                 24 h       124         Low priority queue

The list of queues and partitions is available by typing sinfo or scontrol show partition. Note that not all groups are enabled on every partition: please check the AllowGroups entry in the output of scontrol show partition. You can choose the queue where to run your job with the Slurm directive --partition in your batch script: #SBATCH --partition=<partition_name>
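As a minimal sketch, you can check whether your group is allowed on a given partition as follows (the partition name is an example):

scontrol show partition debug | grep -i AllowGroups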

The command sinfo -l provides an easy-to-read summary of the Slurm batch queues. Please check the other options of the command with sinfo --help.

Please check the Slurm man pages and the official documentation for further details on the scheduler. You will also find useful information in the corresponding section of the FAQ.

Interactive Computing with Jupyter Notebooks

You can access computing resources via your browser through a user interface based on Jupyter. This service is available at https://jupyter-eiger.cscs.ch. Further information is available on the Alps User Guide.

File Systems

The user scratch space /capstor/scratch/cscs/$USER available on the system can be reached with the environment variable $SCRATCH. File system access types (read, write, none) from compute and login nodes are summarized below:


                scratch   /users   /project   /store
Compute nodes   r+w       r+w      r+w        r
Login nodes     r+w       r+w      r+w        r+w

Please carefully read the general information on file systems at CSCS, especially with respect to the soft quota and the cleaning policy enforced on scratch.

Accounting data

The sacct command displays accounting data for jobs stored in the Slurm job accounting database. For example, sacct -j <jobid> displays information about the specified job:

# sacct -j 33306
JobID           JobName  Partition    Account  AllocCPUS      State ExitCode
------------ ---------- ---------- ---------- ---------- ---------- --------
33306             0.slm     normal                   288  COMPLETED      0:0
33306.batch       batch                              288  COMPLETED      0:0
33306.extern     extern                              288  COMPLETED      0:0
33306.0           myexe                              288  COMPLETED      0:0
33306.1           myexe                              288  COMPLETED      0:0

# sacct -V
slurm 23.02.6

The output of the sacct command can be formatted with the --format flag and/or the SLURM_TIME_FORMAT variable (see man sacct), for example:

# SLURM_TIME_FORMAT=%Y/%m/%d-%H:%M:%S sacct \
  --format=jobid,jobname%15,start,end,ConsumedEnergy,Elapsed -j 33306

JobID                JobName               Start                 End ConsumedEnergy    Elapsed
------------ --------------- ------------------- ------------------- -------------- ----------
33306                  0.slm 2024/03/14-13:21:27 2024/03/14-13:23:39          86730   00:02:12
33306.batch            batch 2024/03/14-13:21:27 2024/03/14-13:23:39          86664   00:02:12
33306.extern          extern 2024/03/14-13:21:27 2024/03/14-13:23:39          86730   00:02:12
33306.0                myexe 2024/03/14-13:21:37 2024/03/14-13:22:38          39606   00:01:01
33306.1                myexe 2024/03/14-13:22:38 2024/03/14-13:23:38          39538   00:01:00

As described in the Slurm documentation, the <jobid>.batch step accounts for the resources needed by the jobscript before and after executing the srun commands, together with the resources used by the srun commands themselves. The <jobid>.extern step accounts for the resources used by the job outside of Slurm. Finally, this job called srun twice, as can be seen in the job steps <jobid>.0 and <jobid>.1.
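As a sketch, a batch script with the following structure (executable name and arguments are illustrative) would produce the two job steps <jobid>.0 and <jobid>.1 reported above:

#!/bin/bash -l
#SBATCH --job-name=0.slm
#SBATCH --account=<project>
#SBATCH --constraint=mc

srun ./myexe input1   # accounted as job step <jobid>.0
srun ./myexe input2   # accounted as job step <jobid>.1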