It is possible to use the Singularity container platform on Eiger and Piz Daint. In order to make Singularity available in your environment, you should use the following commands:

Code Block
languagebash
themeRDark
# load singularity on Eiger
ml cray singularity
 
# load singularity on Piz Daint
module load singularity

Singularity can run GPU-enabled and MPI containers. CSCS offers the additional modulefile singularity/3.6.4-daint, customized for Piz Daint, in order to fully exploit the system features. The module defines the necessary environment variables so that the required host system directories are mounted inside the container by Singularity, and it sets LD_LIBRARY_PATH so that the necessary dynamic libraries are available at runtime. Using the CSCS-provided module, the MPI installed in the container image is replaced by the one of the host (Piz Daint), which takes advantage of the high-speed Cray Aries interconnect. The aforementioned module can be loaded as follows:

Code Block
languagebash
themeRDark
module load daint-gpu # or daint-mc
module load singularity/3.6.4-daint
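
If you would like to see exactly what the CSCS module configures, you can inspect it with the standard module tooling. The sketch below assumes the usual Environment Modules/Lmod commands; the exact variable names set by the module may differ between versions:

Code Block
languagebash
themeRDark
# display the environment variables defined by the CSCS-provided module
module show singularity/3.6.4-daint

# after loading the module, check the Singularity-related settings
env | grep -i singularity
echo $LD_LIBRARY_PATH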

The following requirements have to be met by the container images:

  • For GPU-enabled containers, the version of CUDA inside the container has to be supported by the Nvidia driver of the host
  • For MPI-enabled containers, the application inside the container must be dynamically linked to an MPI version that is ABI-compatible with the host MPI (see the quick check sketched after this list)
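
A quick way to sanity-check these requirements is sketched below, assuming a hypothetical image my_app.sif with the application installed at /usr/local/bin/my_app (both names are placeholders):

Code Block
languagebash
themeRDark
# check that the application inside the container is dynamically linked
# and see which MPI library it resolves (image and path are placeholders)
singularity exec my_app.sif ldd /usr/local/bin/my_app | grep -i mpi

# check the CUDA version supported by the Nvidia driver on a GPU node
srun -A <project> -C gpu nvidia-smi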

...

Code Block
languagebash
themeRDark
srun -A <project> -C gpu singularity pull docker://ubuntu:latest

...

Code Block
languagebash
themeRDark
srun -A <project> -C gpu singularity exec ubuntu_latest.sif cat /etc/os-release

...

Code Block
languagetext
themeRDark
NAME="Ubuntu"
VERSION="20.04.1 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 20.04.1 LTS"
VERSION_ID="20.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=focal
UBUNTU_CODENAME=focal

Building container images on your local computer

In order to build container images from singularity definition files, root privileges are required. Therefore, it is not possible to use singularity to build your images on CSCS systems. The suggested method in this case is to build the image using your local computer and then transfer the resulting image to CSCS systems using the Data Transfer service.
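
As an illustration, the local workflow could look like the following sketch; the definition file, image name, username and target path are placeholders, and the actual endpoint to use is documented by the Data Transfer service:

Code Block
languagebash
themeRDark
# build the image locally (requires root privileges)
sudo singularity build my_image.sif my_image.def

# copy the resulting image to CSCS, e.g. with scp
# (replace username, host and path according to the Data Transfer service)
scp my_image.sif <username>@<transfer-host>:/scratch/<path>/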

Running a GPU-enabled container

In this example we are using the following singularity definition file, cuda_device_query.def, to build a container image with singularity:

Code Block
languagetext
themeRDark
Bootstrap: docker
From: nvidia/cuda:10.2-devel

%post
    apt-get update
    apt-get install -y git
    git clone https://github.com/NVIDIA/cuda-samples.git /usr/local/cuda_samples
    cd /usr/local/cuda_samples
    git fetch origin --tags
    git checkout v10.2
    cd Samples/deviceQuery && make

%runscript
    /usr/local/cuda_samples/Samples/deviceQuery/deviceQuery

Based on the cuda_device_query.def definition file given above, you can build the image cuda_device_query.sif with singularity on your local workstation, using root access via sudo:

Code Block
languagebash
themeRDark
sudo singularity build cuda_device_query.sif cuda_device_query.def

The final lines of the above command output should look like this:

Code Block
languagetext
themeRDark
INFO:    Adding runscript
INFO:    Creating SIF file...
INFO:    Build complete: cuda_device_query.sif
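
You can optionally verify the freshly built image and its runscript with singularity inspect (the --runscript flag prints the %runscript section defined above):

Code Block
languagebash
themeRDark
# show image metadata and the embedded runscript
singularity inspect cuda_device_query.sif
singularity inspect --runscript cuda_device_query.sif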

The build command creates the image file cuda_device_query.sif, which can be transferred to the CSCS system using the Data Transfer service. For instance, the following commands run the image on Piz Daint GPU nodes after loading the singularity module as explained above:

Code Block
languagebash
themeRDark
# load the singularity modules if not already loaded
module load daint-gpu
module load singularity/3.6.4-daint

# run singularity using cuda_device_query.sif; the --nv flag makes the host
# Nvidia driver libraries and devices available inside the container
srun -A <project> -C gpu singularity run --nv cuda_device_query.sif

The output of the command above is the following:

Code Block
languagetext
themeRDark
/usr/local/cuda_samples/Samples/deviceQuery/deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "Tesla P100-PCIE-16GB"
  CUDA Driver Version / Runtime Version          10.2 / 10.2
  CUDA Capability Major/Minor version number:    6.0
  Total amount of global memory:                 16281 MBytes (17071734784 bytes)
  (56) Multiprocessors, ( 64) CUDA Cores/MP:     3584 CUDA Cores
  GPU Max Clock rate:                            1329 MHz (1.33 GHz)
  Memory Clock rate:                             715 Mhz
  Memory Bus Width:                              4096-bit
  L2 Cache Size:                                 4194304 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
  Maximum Layered 1D Texture Size, (num) layers  1D=(32768), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(32768, 32768), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 65536
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  2048
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 2 copy engine(s)
  Run time limit on kernels:                     No
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Enabled
  Device supports Unified Addressing (UVA):      Yes
  Device supports Compute Preemption:            Yes
  Supports Cooperative Kernel Launch:            Yes
  Supports MultiDevice Co-op Kernel Launch:      Yes
  Device PCI Domain ID / Bus ID / location ID:   0 / 2 / 0
  Compute Mode:
     < Exclusive Process (many threads in one process is able to use ::cudaSetDevice() with this device) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 10.2, CUDA Runtime Version = 10.2, NumDevs = 1
Result = PASS

...

Running an MPI-enabled container

...

The command will create the image mpi_osu.sif which can be transferred to the CSCS system. For instance on Piz Daint, the following commands are used to run the created image:

Code Block
languagebash
themeRDark
# load the corresponding modules if not already loaded
module load daint-gpu # or daint-mc
module load singularity/3.6.4-daint

# run mpi_osu.sif with singularity using 2 compute nodes
srun -C gpu -N2 --account=<project> singularity run mpi_osu.sif

...

It is possible to run an MPI container without replacing the container's MPI with the host one. To do so, Slurm has to be instructed to use the PMI-2 process management interface, and the MPI inside the container has to be built with PMI-2 support. The container image used in the previous example can then be run as follows on Piz Daint:

Code Block
languagebash
themeRDark
# load the corresponding modules if not already loaded
module load daint-gpu # or daint-mc
module load singularity/3.6.4-daint

# run using 2 compute nodes using the PMI-2 interface
srun --mpi=pmi2 -A <project> -C gpu -N2 singularity run mpi_osu.sif
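
To check which process management interfaces the Slurm installation supports, you can list the available MPI plugin types; pmi2 should appear in the list for the command above to work:

Code Block
languagebash
themeRDark
# list the MPI/PMI plugin types supported by Slurm
srun --mpi=list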

...

Code Block
languagetext
themeRDark
# OSU MPI Bandwidth Test v5.3.2
# Size      Bandwidth (MB/s)
1                       0.44
2                       0.87
4                       1.75
8                       3.49
16                      7.02
32                     14.12
64                     27.69
128                    55.51
256                   110.65
512                   161.96
1024                  181.88
2048                  355.33
4096                  678.33
8192                 1328.71
16384                2440.92
32768                3277.84
65536                4343.05
131072               4139.05
262144               4596.31
524288               4888.90
1048576              5094.95
2097152              5149.54
4194304              5180.42
Note
In order to use the MPI library of the container, the module singularity/3.6.4 should be loaded instead of singularity/3.6.4-daint

Additional information

Please consult the official Singularity documentation for additional information.

...