It is possible to use the Singularity container platform on Eiger and Piz Daint. In order to make Singularity available in your environment, you should use the following commands:
Code Block
# load singularity on Eiger
ml cray singularity

# load singularity on Piz Daint
module load singularity
Singularity can run GPU-enabled and MPI containers. CSCS offers the additional modulefile singularity/3.6.4-daint, customized for Piz Daint, in order to fully exploit the system features. The module defines the necessary environment variables so that the required host system directories are mounted inside the container by Singularity. Furthermore, LD_LIBRARY_PATH is set so that the necessary dynamic libraries are available at runtime. Using the CSCS-provided module, the MPI installed in the container image is replaced by the one of the host (Piz Daint), which takes advantage of the high-speed Cray Aries interconnect. The aforementioned module can be loaded as follows:
Code Block
module load daint-gpu                # or daint-mc
module load singularity/3.6.4-daint
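If you would like to see exactly what the module configures, you can inspect it with the standard module tooling. The sketch below assumes the usual Singularity variable names (SINGULARITY_BINDPATH for bind mounts and the SINGULARITYENV_ prefix for variables passed into the container); the CSCS module may set different ones.
Code Block
# show the environment variables and paths defined by the module
module show singularity/3.6.4-daint

# after loading the module, check which host directories will be bind-mounted
# (variable names are illustrative; the module may use different ones)
echo $SINGULARITY_BINDPATH
echo $SINGULARITYENV_LD_LIBRARY_PATH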
The following requirements have to be met by the container images:
- MPI-enabled containers must dynamically link the application inside the container
- For GPU-enabled containers, the version of CUDA inside the container has to be supported by the Nvidia driver of the host
- For MPI-enabled containers, the application inside the container must be dynamically linked to an MPI version that is ABI-compatible with the host MPI (see the check sketched after this list)
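A quick way to verify the dynamic-linking requirement is to run ldd on the application inside the image. The sketch below uses a placeholder image name and binary path, which you should replace with your own:
Code Block
# list the shared libraries of the application inside the container and look for the MPI library
# (<image>.sif and /path/to/app are placeholders)
singularity exec <image>.sif ldd /path/to/app | grep -i mpi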
...
Code Block
srun -A <project> -C gpu singularity pull docker://ubuntu:latest   # use -C mc on multicore nodes
...
Code Block
srun -A <project> -C gpu singularity exec ubuntu_latest.sif cat /etc/os-release   # use -C mc on multicore nodes
...
Code Block
NAME="Ubuntu"
VERSION="20.04.1 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 20.04.1 LTS"
VERSION_ID="20.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=focal
UBUNTU_CODENAME=focal
Building container images on your local computer
In order to build container images from singularity definition files, root privileges are required. Therefore, it is not possible to use singularity to build your images on CSCS systems. The suggested method in this case is to build the image using your local computer and then transfer the resulting image to CSCS systems using the Data Transfer service.
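For example, a locally built image could be copied to CSCS with scp through the front end ela.cscs.ch (shown here only as an illustration; adapt the host, username and target path to your setup):
Code Block
# copy the locally built image to CSCS (host and target path are illustrative)
scp cuda_device_query.sif <username>@ela.cscs.ch:/users/<username>/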
Running a GPU-enabled container
In this example we are using the following singularity definition file, cuda_device_query.def, to build a container image with singularity:
Code Block
Bootstrap: docker
From: nvidia/cuda:10.2-devel
%post
apt-get update
apt-get install -y git
git clone https://github.com/NVIDIA/cuda-samples.git /usr/local/cuda_samples
cd /usr/local/cuda_samples
git fetch origin --tags
git checkout v10.2
cd Samples/deviceQuery && make
%runscript
/usr/local/cuda_samples/Samples/deviceQuery/deviceQuery
Based on the cuda_device_query.def definition file given above, you can build the image cuda_device_query.sif with singularity on your local workstation, where you have root access via sudo:
Code Block
sudo singularity build cuda_device_query.sif cuda_device_query.def
The final lines of the above command output should look like this:
Code Block
INFO: Adding runscript
INFO: Creating SIF file...
INFO: Build complete: cuda_device_query.sif
The command will create the image file cuda_device_query.sif, which can be transferred to the CSCS system using the Data Transfer service. For instance, the following commands are used to run the image on Piz Daint GPU nodes after loading the singularity module as explained above:
Code Block
# load the singularity modules if not already loaded
module load daint-gpu
module load singularity/3.6.4-daint

# run singularity using cuda_device_query.sif
srun -A <project> -C gpu singularity run --nv cuda_device_query.sif
The output of the command above is the following:
Code Block
/usr/local/cuda_samples/Samples/deviceQuery/deviceQuery Starting...
CUDA Device Query (Runtime API) version (CUDART static linking)
Detected 1 CUDA Capable device(s)
Device 0: "Tesla P100-PCIE-16GB"
CUDA Driver Version / Runtime Version 10.2 / 10.2
CUDA Capability Major/Minor version number: 6.0
Total amount of global memory: 16281 MBytes (17071734784 bytes)
(56) Multiprocessors, ( 64) CUDA Cores/MP: 3584 CUDA Cores
GPU Max Clock rate: 1329 MHz (1.33 GHz)
Memory Clock rate: 715 Mhz
Memory Bus Width: 4096-bit
L2 Cache Size: 4194304 bytes
Maximum Texture Dimension Size (x,y,z) 1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
Maximum Layered 1D Texture Size, (num) layers 1D=(32768), 2048 layers
Maximum Layered 2D Texture Size, (num) layers 2D=(32768, 32768), 2048 layers
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total number of registers available per block: 65536
Warp size: 32
Maximum number of threads per multiprocessor: 2048
Maximum number of threads per block: 1024
Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535)
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Concurrent copy and kernel execution: Yes with 2 copy engine(s)
Run time limit on kernels: No
Integrated GPU sharing Host Memory: No
Support host page-locked memory mapping: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support: Enabled
Device supports Unified Addressing (UVA): Yes
Device supports Compute Preemption: Yes
Supports Cooperative Kernel Launch: Yes
Supports MultiDevice Co-op Kernel Launch: Yes
Device PCI Domain ID / Bus ID / location ID: 0 / 2 / 0
Compute Mode:
< Exclusive Process (many threads in one process is able to use ::cudaSetDevice() with this device) >
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 10.2, CUDA Runtime Version = 10.2, NumDevs = 1
Result = PASS
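As an additional sanity check, not part of the example above, you can run nvidia-smi through the same image to confirm that the --nv flag makes the host GPU driver visible inside the container:
Code Block
# the --nv flag bind-mounts the host NVIDIA driver and utilities into the container
srun -A <project> -C gpu singularity exec --nv cuda_device_query.sif nvidia-smi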
...
...
Running an MPI-enabled container
...
The command will create the image mpi_osu.sif, which can be transferred to the CSCS system. For instance, on Piz Daint the following commands are used to run the created image:
Code Block
# load the corresponding modules if not already loaded
module load daint-gpu                # or daint-mc
module load singularity/3.6.4-daint

# run mpi_osu.sif with singularity using 2 compute nodes
srun -C gpu -N2 --account=<project> singularity run mpi_osu.sif   # use -C mc on multicore nodes
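If you want to check which library paths the CSCS module injects into the container at runtime, a simple sketch is to print the library search path as seen from inside the container (the exact content depends on the modules you have loaded):
Code Block
# print the library search path as seen from inside the container
srun -A <project> -C gpu -N1 singularity exec mpi_osu.sif sh -c 'echo $LD_LIBRARY_PATH'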
...
It is possible to run an MPI container without replacing the container's MPI with the host one. In order to do so, we have to instruct Slurm to use the PMI-2 process management interface. Furthermore, the container's MPI has to be configured with PMI-2 enabled. Therefore, the container image used in the previous example can be run as follows on Piz Daint:
Code Block
# load the corresponding modules if not already loaded
module load daint-gpu                # or daint-mc
module load singularity/3.6.4-daint

# run using 2 compute nodes and the PMI-2 interface
srun --mpi=pmi2 -A <project> -C gpu -N2 singularity run mpi_osu.sif   # use -C mc on multicore nodes
...
Code Block
# OSU MPI Bandwidth Test v5.3.2
# Size      Bandwidth (MB/s)
1                       0.44
2                       0.87
4                       1.75
8                       3.49
16                      7.02
32                     14.12
64                     27.69
128                    55.51
256                   110.65
512                   161.96
1024                  181.88
2048                  355.33
4096                  678.33
8192                 1328.71
16384                2440.92
32768                3277.84
65536                4343.05
131072               4139.05
262144               4596.31
524288               4888.90
1048576              5094.95
2097152              5149.54
4194304              5180.42
Note
In order to use the MPI library of the container, the module singularity/3.6.4 should be loaded instead of singularity/3.6.4-daint.
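Following the note above, a run that keeps the container's own MPI might look like the sketch below, which simply combines the previous srun command with the plain singularity module:
Code Block
# load the plain module so that the container's MPI is NOT replaced by the host one
module load singularity/3.6.4

# launch through PMI-2 so the container's MPI can set up communication across nodes
srun --mpi=pmi2 -A <project> -C gpu -N2 singularity run mpi_osu.sif   # use -C mc on multicore nodes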
Additional information
Please consult the official Singularity documentation for additional information.
...