Daint (daint.alps) is a vCluster deployed on Alps for the HPC Platform. It replaces the Cray XC Piz Daint, which has reached end-of-life. 


Maintenance

  • Tuesday morning 8-12 CET is reserved for periodic updates, with services potentially unavailable during this timeframe
  • Exceptional and non-disruptive updates may happen outside this time frame and will be announced to the users mailing list

Change log

 

The version of uenv on daint, clariden and santis has been updated this morning with new features, bug fixes and other improvements. It also introduces some breaking changes, the main one being that the uenv view command is no longer supported:

  • Instead, provide the view when starting the environment, e.g. uenv start --view=gromacs gromacs/2024:v1

Please check the updated documentation available on the dedicated page uenv user environments.

The Slurm queue xfer is available for internal data transfer on Daint. The data mover nodes temporarily mount, with read-only access, /project.CrayXC, /store.CrayXC and /users.CrayXC, which correspond to the GPFS mount points /project, /store and /users currently available on the Piz Daint XC login nodes.
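
A minimal transfer job could look like the sketch below; the partition and the read-only mount points come from the description above, while the source and destination paths are purely illustrative placeholders:

Example transfer job script
#!/bin/bash -l
#SBATCH --job-name=transfer
#SBATCH --partition=xfer
#SBATCH --time=02:00:00

# Copy data from the read-only Piz Daint XC mount point to scratch (paths are placeholders)
rsync -av /project.CrayXC/<group_id>/<folder>/ $SCRATCH/<folder>/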

Known issues

Quota command 

The quota command is unavailable on the login nodes.

Jobs silently crashing in Slurm prolog

We are investigating an issue where interactive jobs are silently crashing in the Slurm prolog and being removed from the queue. 

[daint][user@daint-ln001 ~]$ srun -p debug --pty bash
srun: job 152105 queued and waiting for resources
<hanging, but the job has been cancelled by slurm>

Access

Log in as you would for other CSCS systems, by first configuring your SSH keys (see the MFA documentation for more information), then log into Daint via the front end server ela.cscs.ch:

$ ssh -A ela.cscs.ch
$ ssh daint.alps.cscs.ch


Simplifying login

To log in directly to Daint without first logging into ela.cscs.ch, you can add the following configuration to the ~/.ssh/config file on your laptop or PC:

~/.ssh/config
Host ela
    HostName ela.cscs.ch
    User <<username>>
    IdentityFile ~/.ssh/cscs-key
Host daint.alps
    HostName daint.alps.cscs.ch
    User <<username>>
    IdentityFile ~/.ssh/cscs-key
    ProxyJump ela
    AddKeysToAgent yes
    ForwardAgent yes

Where <<username>> is your CSCS account name.

Now you can access Daint directly from your laptop or PC:

ssh daint.alps
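
The same configuration also lets you copy files to Daint directly from your machine; the file name and target path below are illustrative:

scp ./input.dat daint.alps:/capstor/scratch/cscs/<<username>>/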


Cluster Specifications

All nodes are identical, with 4 Grace-Hopper modules per node. Specifically:

  • User Access Node (UAN): 4 repurposed GH compute nodes that serve as login nodes (daint-ln00[1-4])
  • Compute Node (CN): The number of compute nodes will change over time. You can get an up-to-date number using the command sinfo -s on the UAN. The majority of the nodes are provided in the Slurm partition normal, while a smaller number is accessible through the partition debug, meant for short test jobs with a quick turnaround.

Each node has approximately 800GB of free memory accessible from all sockets. Each Grace CPU has 72 cores with the following specification:

  • Arm V9.0 ISA compliant aarch64 (Neoverse V2 “Demeter” architecture)
  • Full SVE-2 Vector Extensions support, inclusive of NEON instructions
  • Supports 48-bit virtual and 48-bit physical address space

Each Hopper GPU has 96GB of RAM. NVLink provides all-to-all cache-coherent access between all host and device memory.
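
If you would like to verify the node layout yourself, generic tools are enough for a quick check; the commands below are a sketch and not specific to Daint:

# Report the four Hopper GPUs and their memory
$ srun -p debug -N 1 nvidia-smi --query-gpu=name,memory.total --format=csv
# Report the core and socket layout of a compute node
$ srun -p debug -N 1 lscpu | grep -E '^CPU\(s\)|Socket'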

A login node is a shared resource. Do not run compute-intensive jobs on a login node, and do not start the CUDA MPS service there, as you might impact the work of others. Please have a look at the Policies that apply on CSCS computing systems.

Programming Environment

uenv

User environments (uenv) are used to provide programming environments and application software. Please refer to the uenv documentation for detailed information on how to use the uenv tools on the system.
You can list the uenv provided on the system with the command uenv image find. A non-exhaustive list of the software provided by uenv images currently available on the system is shown below:


Please note that uenv images provided on the system Todi should also work on Daint. They can be accessed via

CLUSTER_NAME=todi uenv image find

Please always prepend CLUSTER_NAME=todi to uenv commands that should address an image available on Todi.
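
For example, to search, pull and start an image hosted for Todi (the image name below is purely illustrative):

CLUSTER_NAME=todi uenv image find
CLUSTER_NAME=todi uenv image pull prgenv-gnu/24.7:v1
CLUSTER_NAME=todi uenv start prgenv-gnu/24.7:v1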

Cray Programming Environment (CPE)

CPE is provided on Daint; however, CSCS does not officially support CPE or provide software built with CPE on the system.
The supported method for building software is uenv (see above): this is a key difference from the previous system Piz Daint (Cray XC).

To enable Cray Programming Environment (CPE), please run

$ module load cray
$ module load PrgEnv-cray

# to check that CPE is loaded:
$ ftn --version
Cray Fortran : Version 17.0.0
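
With CPE loaded, the Cray compiler wrappers cc, CC and ftn can be used to build code as usual; a minimal sketch with illustrative source file names:

$ cc  -O2 -o app_c   app.c    # C
$ CC  -O2 -o app_cxx app.cpp  # C++
$ ftn -O2 -o app_f   app.f90  # Fortran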

Container engine

The Container Engine (CE) is available on the system: it is a toolset designed to enable computing jobs to seamlessly run inside Linux application containers, thus providing support for containerized user environments. Please see the dedicated page to use CE.
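
As a minimal sketch only (the EDF location, image reference and srun option below follow the general CE workflow and are assumptions here, not Daint-specific guarantees), a container can be requested at job submission through an environment definition file (EDF):

# Illustrative EDF $HOME/.edf/ubuntu.toml containing a single line:
#   image = "ubuntu:24.04"
# Request the container at job submission:
$ srun -p debug --environment=ubuntu cat /etc/os-release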

File systems

The following mount points are available:

  • /users/$USER - $HOME
  • /capstor/scratch/cscs/$USER  - $SCRATCH
  • /capstor/store - $PROJECT  

The environment variables $HOME and $SCRATCH give you access to the dedicated user folders /users/$USER and /capstor/scratch/cscs/$USER respectively. The $PROJECT environment variable targets your personal folder /capstor/store/cscs/userlab/<group_id>/$USER, only for UserLab customers on the capstor storage: please note that users need to create their own sub-folders under the project folder, as they are not created automatically.
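
For instance, assuming $PROJECT is defined for your account, a first step could be:

$ mkdir -p $PROJECT                # your personal folder under the project folder
$ mkdir -p $PROJECT/<sub_folder>   # further sub-folders as needed (placeholder name)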

Please check the occupancy of your user (data volume and number of inodes) on the different file systems with the quota command, which is currently available on the frontend Ela.
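
Since quota is not available on the Daint login nodes (see Known issues above), you can run it on Ela instead:

$ ssh ela.cscs.ch
$ quota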

Slurm configuration

Currently there is no fair share policy applied.

Partition   Max time   Max nodes   Brief description
normal      24 h       568         Standard queue for production work
debug       30 min     10          Quick turnaround for test jobs (one per user)
xfer        24 h       1           Internal transfer queue

Nodes are not shared (except for the xfer queue) and at least 1 node must be allocated for your job.
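
A quick way to check the configuration is a short job on the debug partition, within the limits listed above:

$ srun -p debug -N 1 -t 00:10:00 hostname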

Running jobs

In the current Slurm configuration, OMP threads are placed consecutively on the cores and MPI ranks are placed in a round-robin fashion between the 4 sockets. The example below demonstrates the output of running 8 MPI ranks with 4 cores per rank:

$ srun -N1 -c4 -n8 ./a.out
MPI rank : 6, OMP thread id : 0, CPU core id : 149
MPI rank : 6, OMP thread id : 3, CPU core id : 148
MPI rank : 6, OMP thread id : 1, CPU core id : 150
MPI rank : 6, OMP thread id : 2, CPU core id : 151
MPI rank : 0, OMP thread id : 1, CPU core id : 2
MPI rank : 0, OMP thread id : 0, CPU core id : 1
MPI rank : 0, OMP thread id : 3, CPU core id : 0
MPI rank : 0, OMP thread id : 2, CPU core id : 3
MPI rank : 4, OMP thread id : 0, CPU core id : 5
MPI rank : 4, OMP thread id : 2, CPU core id : 7
MPI rank : 4, OMP thread id : 3, CPU core id : 4
MPI rank : 4, OMP thread id : 1, CPU core id : 6
MPI rank : 5, OMP thread id : 0, CPU core id : 76
MPI rank : 5, OMP thread id : 3, CPU core id : 79
MPI rank : 5, OMP thread id : 2, CPU core id : 78
MPI rank : 5, OMP thread id : 1, CPU core id : 77
MPI rank : 1, OMP thread id : 0, CPU core id : 73
MPI rank : 1, OMP thread id : 3, CPU core id : 72
MPI rank : 1, OMP thread id : 2, CPU core id : 75
MPI rank : 1, OMP thread id : 1, CPU core id : 74
MPI rank : 3, OMP thread id : 0, CPU core id : 216
MPI rank : 3, OMP thread id : 1, CPU core id : 217
MPI rank : 3, OMP thread id : 2, CPU core id : 218
MPI rank : 3, OMP thread id : 3, CPU core id : 219
MPI rank : 7, OMP thread id : 0, CPU core id : 220
MPI rank : 7, OMP thread id : 3, CPU core id : 221
MPI rank : 7, OMP thread id : 1, CPU core id : 223
MPI rank : 7, OMP thread id : 2, CPU core id : 222
MPI rank : 2, OMP thread id : 0, CPU core id : 147
MPI rank : 2, OMP thread id : 3, CPU core id : 144
MPI rank : 2, OMP thread id : 2, CPU core id : 145
MPI rank : 2, OMP thread id : 1, CPU core id : 146
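
If you want Slurm itself to report the binding it applies, you can add the verbose option to --cpu-bind (output omitted here):

$ srun -N1 -c4 -n8 --cpu-bind=verbose ./a.out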

Oversubscription of GPU cards

If you want to share GPU card(s) between multiple MPI ranks, you currently need to start the multi-process service (MPS) daemon on the node yourself, since the CRAY_CUDA_MPS variable is no longer supported. To do so, you can use a simple wrapper script:

MPS wrapper script
#!/bin/bash
# Example mps-wrapper.sh usage:
# > srun [srun args] mps-wrapper.sh [cmd] [cmd args]
export CUDA_MPS_PIPE_DIRECTORY=/tmp/nvidia-mps
export CUDA_MPS_LOG_DIRECTORY=/tmp/nvidia-log
export CUDA_VISIBLE_DEVICES=$(( SLURM_LOCALID % 4 ))
# Launch MPS from a single rank per node
if [ $SLURM_LOCALID -eq 0 ]; then
    CUDA_VISIBLE_DEVICES=0,1,2,3 nvidia-cuda-mps-control -d
fi
# Wait for MPS to start
sleep 5
# Run the command
"$@"
# Quit MPS control daemon before exiting
if [ $SLURM_LOCALID -eq 0 ]; then
    echo quit | nvidia-cuda-mps-control
fi

and run your code using the following sample slurm script:

Batch submission script
#!/bin/bash -l
#SBATCH --job-name=<job_name>
#SBATCH --time=01:30:00 #HH:MM:SS
#SBATCH --nodes=2
#SBATCH --ntasks-per-core=1
#SBATCH --ntasks-per-node=32 #32 MPI ranks per node
#SBATCH --cpus-per-task=<cpus_per_rank> #cores per rank, used by OMP_NUM_THREADS below
#SBATCH --account=<account>
#SBATCH --hint=nomultithread
#SBATCH --exclusive

export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
export MPICH_MALLOC_FALLBACK=1

ulimit -s unlimited

srun --cpu-bind=socket ./mps-wrapper.sh <code> <args>
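
Remember to make the wrapper script executable before submitting; the submission script name below is a placeholder:

$ chmod +x mps-wrapper.sh
$ sbatch submit.sh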

