Alps (Daint) – hereafter referred to as "Daint" – is a vCluster deployed on Alps for the HPC Platform. It is replacing the Cray XC Piz Daint, which is reaching end of life. Access to Daint is being granted in batches to User Lab projects migrating from the Cray XC system.
Daint is in its early access phase. CSCS engineers are working continuously on the system to improve the quality of service. Users are expected to perform their tests and benchmarks with care. Please report issues on the CSCS Service Desk.
Change log
- In order to complete the preparatory work necessary to deliver Alps in production, as of September 18 2024 the vCluster Daint on Alps will no longer be accessible until further notice; early access will still be granted on Tödi. Please consult the Tödi early access page for information on the vCluster Tödi on Alps.
- You can now access Alps (Daint) with ssh daint.alps.cscs.ch from the front end.
- There are now 604 GH compute nodes available for use through Slurm.
- Daint is deployed with one login node and ca. 400 GH compute nodes available for use through Slurm.
Access
Log in as you would for other CSCS systems, by first configuring your SSH keys (see the MFA documentation for more information), then log into Daint via the front end server ela.cscs.ch:

$ ssh -A ela.cscs.ch
$ ssh daint.alps.cscs.ch
Simplifying log in
To log in directly to Daint without first logging into ela.cscs.ch, you can add some configuration to the ~/.ssh/config file on your laptop or PC:
Host ela
    HostName ela.cscs.ch
    User <<username>>
    IdentityFile ~/.ssh/cscs-key

Host daint.alps
    HostName daint.alps.cscs.ch
    User <<username>>
    IdentityFile ~/.ssh/cscs-key
    ProxyJump ela
    AddKeysToAgent yes
    ForwardAgent yes
Where <<username>> is your CSCS account name.
Now you can access Daint directly from your laptop or PC:
ssh daint.alps
Cluster Specifications
All nodes are identical, with 4 Grace-Hopper modules per node. Specifically:
- login nodes: 1 repurposed GH compute node (daint-ln001)
- compute nodes: the number of compute nodes will change over time; you can get an up-to-date number with the command sinfo (see the example below). The majority of the nodes is provided in a single Slurm partition called normal. A smaller number is accessible through the Slurm partition debug, which provides fast turnaround (maximum wall time of 30 min and a maximum of 10 nodes).
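For example, to check the current number of nodes and the limits of each partition (the output format string below is just one possible choice; the counts will change over time):

$ sinfo -o "%P %D %l"    # partition, number of nodes, time limit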
Each node has approximately 800GB of free memory accessible from all sockets. Each Grace CPU has 72 cores with the following specification:
- Arm V9.0 ISA compliant aarch64 (Neoverse V2 “Demeter” architecture)
- Full SVE-2 Vector Extensions support, inclusive of NEON instructions
- Supports 48-bit virtual and 48-bit physical address space
Each Hopper GPU has 96GB of RAM. NVLINK provides all-to-all cache-coherent memory between all host and device memory.
A login node is a shared resource. Do not run compute-intensive jobs on a login node, and do not start the CUDA MPS service there, as you might impact the work of others.
Programming Environment
UENV
Uenv are used to provide programming environments and application software. Please refer to the main uenv documentation for details on how to use the uenv tools installed on the system. On Daint the following uenv are provided (run uenv image find for the up-to-date list); a short usage sketch follows the list of uenv below:
- CP2K
- NAMD
- Linaro Forge
- prgenv-gnu
- Quantum ESPRESSO
- VASP
Coming soon:
Please note that uenv images provided on the system Tödi should also work on Daint; users are therefore encouraged to test them.
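As a quick sketch of the typical workflow (the image name and version tag below are placeholders; use the exact names and tags reported by uenv image find):

$ uenv image find                      # list the uenv images available on the system
$ uenv image pull prgenv-gnu/24.7:v1   # pull an image into your local repository (tag is illustrative)
$ uenv start prgenv-gnu/24.7:v1        # start a shell with the uenv mounted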
Cray Programming Environment (CPE)
CPE is provided on Daint; however, CSCS does not officially support CPE or provide software built with CPE on the system.
The supported method for building software is uenv (see above): this is a key difference from the old Cray XC Piz Daint.
To enable Cray Programming Environment (CPE), please run
$ module load cray
$ module load PrgEnv-cray
$ module load craype-arm-grace

# to check that the PE is loaded:
$ ftn --version
Cray Fortran : Version 17.0.0
Container engine
The Container Engine is available. Please see the page on using container images with it.
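A minimal sketch, assuming the EDF-based workflow described in the Container Engine documentation (the EDF name, its location and the container image are illustrative):

$ cat $HOME/.edf/ubuntu.toml
image = "library/ubuntu:24.04"

$ srun --environment=ubuntu cat /etc/os-release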
File systems
The following mount points are available:
- /users/$USER - $HOME
- /capstor/scratch/cscs/$USER - $SCRATCH
- /store

You will always have access to /users and /capstor/scratch/cscs. Access to the other locations will depend on the allocations set in your project.

In order to check the occupancy by your user (data volume and number of inodes), you can run the quota command on the login nodes.
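For example, on a login node:

$ quota    # reports data volume and inode usage for your user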
Slurm configuration
Currently there is no fair share policy applied.
Partition | Max time | Max nodes | Brief Description |
---|---|---|---|
normal | 24 h | 604 | Standard queue for production work |
debug | 30 min | 10 | Quick turnaround for test jobs (one per user) |
low | 24 h | 604 | Up to 130% of project's quarterly allocation |
prepost | 30 min | 1 | High priority pre/post processing |
Nodes are not shared and at least 1 node must be allocated for your job.
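For example, a short interactive test on the debug partition could look like this (a sketch; <account> is your project's account name, as in the sample script at the end of this page):

$ srun --partition=debug --nodes=1 --time=00:10:00 --account=<account> hostname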
Running jobs
In the current Slurm configuration, OMP threads are placed consecutively on the cores and MPI ranks are placed in a round-robin fashion across the 4 sockets. The example below shows how to check the placement obtained when running 8 MPI ranks with 4 cores per rank:
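A minimal sketch to inspect the resulting placement (the inline bash command only prints each rank's CPU affinity; the binding option matches the sample script at the end of this page):

$ OMP_NUM_THREADS=4 srun --nodes=1 --ntasks=8 --cpus-per-task=4 --cpu-bind=socket \
    bash -c 'echo "rank $SLURM_PROCID -> cpus $(taskset -cp $$ | cut -d: -f2)"'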
Oversubscription of GPU cards
If you want to share GPU card(s) between multiple MPI ranks, you currently need to start the Multi-Process Service (MPS) daemon on the node yourself, because the CRAY_CUDA_MPS variable is no longer supported. To do so, you can use a simple wrapper script:
#!/bin/bash
# Example mps-wrapper.sh usage:
# > srun [srun args] mps-wrapper.sh [cmd] [cmd args]
export CUDA_MPS_PIPE_DIRECTORY=/tmp/nvidia-mps
export CUDA_MPS_LOG_DIRECTORY=/tmp/nvidia-log
export CUDA_VISIBLE_DEVICES=$(( SLURM_LOCALID % 4 ))
# Launch MPS from a single rank per node
if [ $SLURM_LOCALID -eq 0 ]; then
    CUDA_VISIBLE_DEVICES=0,1,2,3 nvidia-cuda-mps-control -d
fi
# Wait for MPS to start
sleep 5
# Run the command
"$@"
# Quit MPS control daemon before exiting
if [ $SLURM_LOCALID -eq 0 ]; then
    echo quit | nvidia-cuda-mps-control
fi
and run your code using the following sample Slurm script:
#!/bin/bash -l
#SBATCH --job-name=<job_name>
#SBATCH --time=01:30:00          # HH:MM:SS
#SBATCH --nodes=2
#SBATCH --ntasks-per-core=1
#SBATCH --ntasks-per-node=32     # 32 MPI ranks per node
#SBATCH --account=<account>
#SBATCH --hint=nomultithread
#SBATCH --hint=exclusive

export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
export MPICH_MALLOC_FALLBACK=1
ulimit -s unlimited

srun --cpu-bind=socket ./mps-wrapper.sh <code> <args>