Santis (santis.alps.cscs.ch) is the vCluster of the User Lab deployed on the Alps infrastructure as part of the climate and weather platform. Its name derives from Säntis, the highest mountain in the Alpstein massif of northeastern Switzerland.
Latest Updates
Maintenances
- Regular activities
  - When: Wednesday mornings from 08h00 to 12h00 CET
  - What: mainly adjustment of service deployments on Santis
  - Potential service interruptions can occur during these activities
- Exceptional activities
  - When: anytime
  - What: e.g., urgent security patches or work affecting the larger infrastructure
  - Announcement via mailing list, status page, etc.
Change log
No recent updates.
Known issues
ICON jobs
- Jobs hanging in I/O
  - Certain ICON jobs seem to encounter an issue when restarted from a previous snapshot. We are investigating possible root causes.
- Jobs hanging without crashing for specific node counts
  - Currently observed with coupled ICON simulations.
  - Try setting the following two environment variables for the libfabric of the Slingshot high-speed network (see the snippet after this list for how to apply them in a job script):

```bash
export FI_CXI_OFLOW_BUF_COUNT=10  # Number of CXI overflow buffers allocated to hold unexpected messages. Each buffer holds FI_CXI_OFLOW_BUF_SIZE bytes. Only applies to Slingshot 11.
export FI_MR_CACHE_MAX_COUNT=0    # Disable CXI provider memory registration (MR) caching.
```
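A minimal sketch of how these variables might be applied in practice; the executable name ./icon and the bare srun line are placeholders, not an official recommendation:

```bash
# In the batch script of the affected job, before the srun line:
export FI_CXI_OFLOW_BUF_COUNT=10   # extra CXI overflow buffers for unexpected messages
export FI_MR_CACHE_MAX_COUNT=0     # disable MR caching in the CXI provider

srun ./icon                        # placeholder: use your usual launch line
```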
Cluster specifications
Hardware
All nodes are identical, with 4 Grace-Hopper (GH) modules per node. Specifically:
- User Access Node (UAN):
  - Repurposed GH compute nodes that serve as login nodes (santis-ln00[1-4])
- Compute Node (CN):
  - The number of compute nodes will change over time. You can get an up-to-date number with the command `sinfo -s` on the UAN (see the example below).
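For example, running the following on a login node prints a summary of the partitions and their current node counts (the output changes over time, so none is reproduced here):

```bash
# Summarised view of partitions and node counts on Santis
sinfo -s
```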
A GH module consists of:
Name | Type | Compute | Memory |
---|---|---|---|
Grace | CPU | 72 ARM cores | 128GB LPDDR |
Hopper | GPU | - | 96GB HBM3 |
For more information, please also have a look at https://www.cscs.ch/computers/alps.
In the configuration of Santis, a GH node offers approximately 800GB of free, unified (CPU + GPU) memory: the four modules together provide 4 × (128GB LPDDR + 96GB HBM3) = 896GB of raw memory, of which roughly 800GB is accessible to applications.
Please note that the complete memory is accessible from all modules.
The ARM cores of a GH module have the following specifications:
- Arm V9.0 ISA compliant aarch64 (Neoverse V2 "Demeter" architecture)
- Full SVE-2 Vector Extensions support, inclusive of NEON instructions
- Supports 48-bit virtual and 48-bit physical address space
The Hopper GPUs are connected via NVLink, providing all-to-all, cache-coherent access to all host and device memory.
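As a hedged illustration of what this hardware means for building code (the compiler choice, flags, and file names are assumptions, not an official recommendation), binaries for a GH module are typically built for the Neoverse V2 cores and the Hopper GPU:

```bash
# Hypothetical build commands; adapt to the toolchain provided by your uenv
gcc  -O3 -mcpu=neoverse-v2 app.c  -o app_cpu   # Grace cores: Arm Neoverse V2 with SVE2
nvcc -O3 -arch=sm_90       app.cu -o app_gpu   # Hopper GPU: compute capability 9.0
```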
File systems
The following mount points can be found on Santis:
Mount point | Environment variable | File system | Features |
---|---|---|---|
/users/$USER | $HOME | NFS | Snapshot, Backup |
/capstor/scratch/cscs/$USER | $SCRATCH | Lustre | - |
/capstor/store/cscs/userlab/<GROUP_ID> | - | Lustre | Backup |
NB: On $SCRATCH a quota of 150TB and 1M inodes (files and folders) is applied. These are implemented as soft quotas, i.e., upon reaching either limit you are given a grace period of 1 week before write access to $SCRATCH is blocked for your user (you can still submit jobs).
Please make sure to check your quota regularly using the `quota` command; it is available on the login nodes of Santis as well as on the frontend ela.cscs.ch.
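For example (a trivial usage sketch; the exact output format depends on the site configuration and is not reproduced here):

```bash
# On a login node or on ela.cscs.ch: show current usage and limits
quota
```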
Slurm configuration
The following partitions are configured:
Partition | Max time | Max nodes | Comments |
---|---|---|---|
normal | 24h | - | Standard queue for production work |
debug | 30 min | 8 | Quick turnaround for development; 1 job per user |
xfer | 24h | 1 | Internal transfer of data between file systems and/or clusters |
NB: Nodes are not shared and at least 1 node must be allocated for your job (the exception is the xfer queue).
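As a hedged sketch of how these partitions are used (the account name is a placeholder, following the same <account> convention as the sample script further below, and my_job.sh is a hypothetical script name):

```bash
# Interactive allocation of one node in the debug partition for 30 minutes
salloc --partition=debug --nodes=1 --time=00:30:00 --account=<account>

# Submit a batch job to the normal partition instead
sbatch --partition=normal my_job.sh
```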
Task/Thread allocation on a node (see the launch sketch after this list):
- Threads (e.g. OpenMP) are placed consecutively on the cores.
- Tasks (e.g. MPI ranks) are placed in round-robin fashion between the 4 modules of the node.
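A minimal sketch of a hybrid MPI+OpenMP launch that follows this placement model; the executable ./app and the rank/thread counts are assumptions for illustration only:

```bash
# One MPI rank per GH module (4 per node), OpenMP threads on consecutive cores
export OMP_NUM_THREADS=16          # hypothetical thread count per rank
export OMP_PROC_BIND=close         # keep threads on consecutive cores
srun --nodes=1 --ntasks-per-node=4 --cpus-per-task=$OMP_NUM_THREADS ./app
```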
Oversubscription of GPU cards
If you want to share GPU card(s) between multiple MPI ranks, you currently need to start the NVIDIA Multi-Process Service (MPS) daemon on the node yourself.
To do so, you can use a simple wrapper script:
```bash
#!/bin/bash
# Example mps-wrapper.sh usage:
# > srun [srun args] mps-wrapper.sh [cmd] [cmd args]
export CUDA_MPS_PIPE_DIRECTORY=/tmp/nvidia-mps
export CUDA_MPS_LOG_DIRECTORY=/tmp/nvidia-log
export CUDA_VISIBLE_DEVICES=$(( SLURM_LOCALID % 4 ))
# Launch MPS from a single rank per node
if [ $SLURM_LOCALID -eq 0 ]; then
    CUDA_VISIBLE_DEVICES=0,1,2,3 nvidia-cuda-mps-control -d
fi
# Wait for MPS to start
sleep 5
# Run the command
"$@"
# Quit MPS control daemon before exiting
if [ $SLURM_LOCALID -eq 0 ]; then
    echo quit | nvidia-cuda-mps-control
fi
```
and run your code using the following sample slurm script:
```bash
#!/bin/bash -l
#SBATCH --job-name=<job_name>
#SBATCH --time=01:30:00        # HH:MM:SS
#SBATCH --nodes=2
#SBATCH --ntasks-per-core=1
#SBATCH --ntasks-per-node=32   # 32 MPI ranks per node
#SBATCH --account=<account>
#SBATCH --hint=nomultithread
#SBATCH --hint=exclusive

export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
export MPICH_MALLOC_FALLBACK=1
ulimit -s unlimited

srun --cpu-bind=socket ./mps-wrapper.sh <code> <args>
```
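Assuming the script above is saved as submit_mps.sh (a hypothetical filename) and mps-wrapper.sh sits next to it, the job is submitted and monitored in the usual way:

```bash
chmod +x mps-wrapper.sh      # make the wrapper executable once
sbatch submit_mps.sh         # submit the sample job script
squeue -u $USER              # check the state of your jobs
```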
User environments
User environments (uenv) are used to provide programming environments and applications on the system. Please refer to the uenv user environments documentation in this knowledge base for more detailed information on how to use the uenv tooling on the system.
User environments might not be tagged for the system Santis, but for other vClusters.
If there is a uenv image on a different vCluster that you want to use on Santis, send a request to have it deployed on Santis.
It is also possible to directly use uenv images built for other systems using the @system syntax, for example:

```bash
uenv image find @todi
```

The above command, run on Santis, lists all images initially prepared for the vCluster Tödi.
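A hedged sketch of how this could combine with the rest of the uenv workflow; the image name is a placeholder and the exact subcommands should be checked against the uenv documentation:

```bash
# List images built for the Tödi vCluster, pull one, and start a shell in it
uenv image find @todi
uenv image pull prgenv-gnu/24.7:v1@todi    # placeholder image name/tag
uenv start prgenv-gnu/24.7:v1@todi
```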
Container engine
The Container Engine (CE) is available on the system: this toolset is designed to enable computing jobs to run seamlessly inside Linux application containers, thus providing support for containerized user environments. Please see the dedicated page on how to use the CE.
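As a hedged sketch only (the environment definition file format, its location, and the image reference below are assumptions to be verified against the CE documentation), a containerized run typically pairs a small environment definition file with a Slurm launch:

```bash
# Hypothetical environment definition file, e.g. ~/.edf/ubuntu.toml:
#   image  = "library/ubuntu:24.04"
#   mounts = ["/capstor/scratch/cscs/<user>:/capstor/scratch/cscs/<user>"]

# Run a command inside the container described by that environment
srun --environment=ubuntu cat /etc/os-release
```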