NAMD is a parallel molecular dynamics code based on Charm++ designed for high-performance simulation of large biomolecular systems.

NAMD Single Node

Currently, NAMD is provided without MPI support; it can therefore run only on a single node, where it takes advantage of the new GPU-resident mode.

Licensing Terms and Conditions

NAMD is distributed free of charge for research purposes only and not for commercial use: users must agree to the NAMD license in order to use it at CSCS. Users agree to acknowledge the use of NAMD in any reports or publications of results obtained with the Software (see the NAMD Homepage for details).

ALPS (GH200)

You can obtain the NAMD uenv (which provides a Spack-based software stack) as follows:

# List available images
uenv image find namd

# Pull the image of interest
uenv image pull namd/3.0b6:latest

# Start uenv
uenv start namd/3.0b6:latest

NAMD and its main dependencies are conveniently provided as modules. Therefore, once you have started the user environment, you can simply run

uenv modules use
module load namd

in order to have the NAMD executable available.
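
As a quick sanity check, you can confirm that the executable now comes from the uenv image (purely illustrative; the exact installation path will differ):

# namd3 should now be on PATH, pointing inside the uenv image
which namd3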

Single-node, single- or multi-GPU

The single-node build works on a single node and benefits from the new GPU-resident mode (see NAMD 3.0b6 GPU-Resident benchmarking results for more details).


# Run STMV benchmarks
srun -N 1 -n 1 namd3 +p 8 +setcpuaffinity +devices 0 stmv_gpures_nve.namd
srun -N 1 -n 1 namd3 +p 15 +pmepes 7 +setcpuaffinity +devices 0,1 stmv_gpures_nve.namd
srun -N 1 -n 1 namd3 +p 29 +pmepes 5 +setcpuaffinity +devices 0,1,2,3 stmv_gpures_nve.namd

Scaling of the tobacco mosaic virus (STMV) benchmark with GPU-resident mode on our system is the following:

GPUs    ns/day     Speed-up    Parallel efficiency
1       31.1463    -           -
2       53.652     1.72        86%
4       92.692     2.98        74%
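
The Speed-up and Parallel efficiency columns follow directly from the ns/day figures: speed-up is the ns/day on N GPUs divided by the 1-GPU value, and efficiency is that speed-up divided by N. A minimal sketch reproducing them, assuming only a POSIX shell with awk:

# Reproduce the Speed-up and Parallel efficiency columns from the ns/day values
awk 'BEGIN {
  base = 31.1463                                  # 1-GPU throughput (ns/day)
  n = split("2 4", gpus); split("53.652 92.692", nsday)
  for (i = 1; i <= n; i++) {
    s = nsday[i] / base
    printf "%d GPUs: speed-up %.2f, efficiency %.0f%%\n", gpus[i], s, 100 * s / gpus[i]
  }
}'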

The official NAMD 3.0b6 GPU-Resident benchmarking results provide results for A100 GPUs as well as for the older GPU-offload mode. The following graph compares results on the A100 (official benchmarks) and the GH200 (our results) for both GPU-resident and GPU-offload modes.

Piz Daint

Setup

You can see a list of the available versions of the program installed on the machine, after loading the gpu or multicore module, by typing:

module load daint-gpu
module avail NAMD


for the GPU version or

module load daint-mc
module avail NAMD


for the multicore one. The previous set of commands will show the GPU- or multicore-enabled modules of the application. The following module command will then load the environment of the default version of the program:

module load NAMD


You can either type this command every time you intend to use the program within a new session, or you can automatically load it by including it in your shell configuration file.
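
For instance, assuming a bash shell whose configuration file is ~/.bashrc (adapt to your own shell), the lines could be appended as follows:

# Load the Daint GPU environment and NAMD automatically in every new shell
echo "module load daint-gpu" >> ~/.bashrc
echo "module load NAMD" >> ~/.bashrc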

The following module commands will print the environment variables set by loading the program and a help message:

module show NAMD
module help NAMD


How to Run

The CUDA-enabled version of NAMD is installed on Daint. When using this version you should set outputEnergies to 100 or higher in the simulation config file, since outputting energies from the GPU is slower than on the CPU, and you should add +idlepoll to the command line so that the GPU is polled for results rather than left to sleep while idle. Note that some features are unavailable in the CUDA build, including alchemical free energy perturbation and the Lowe-Andersen thermostat.

The GPU code in NAMD is relatively new (introduced first in NAMD 2.7), and forces evaluated on the GPU differ slightly from a CPU-only calculation, so you should test your simulations well before launching production runs.

Note that multiple NAMD processes (or threads) can share the same GPU, and thus it is possible to run with multiple processes per node (see below).

The following job script asks for 16 nodes, using 1 MPI task per node and 24 threads per MPI task with hyperthreading turned on. If you use more than one MPI task per node you will need to set CRAY_CUDA_MPS=1 to enable the tasks to access the GPU device on each node at the same time.

#!/bin/bash -l
#
# NAMD on Piz Daint
#
# 16 nodes, 1 MPI task per node, 24 OpenMP threads per task with hyperthreading (--ntasks-per-core=2)
#
#SBATCH --job-name="namd"
#SBATCH --time=00:30:00
#SBATCH --nodes=16
#SBATCH --ntasks-per-core=2
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=24
#SBATCH --constraint=gpu
#========================================
# load modules and run simulation
module load daint-gpu
module load NAMD
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
srun namd2 +idlepoll +ppn $((SLURM_CPUS_PER_TASK-1)) input.namd > namd.out
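
As noted above, running more than one MPI task per node requires CRAY_CUDA_MPS=1 so that the tasks can share the node's GPU. A hypothetical variant of the same script with 2 tasks per node (splitting the 24 threads into 12 per task) could look like this; treat it as a sketch rather than a validated configuration:

#!/bin/bash -l
#
# Hypothetical variant: 2 MPI tasks per node sharing the GPU through CUDA MPS
#
#SBATCH --job-name="namd-mps"
#SBATCH --time=00:30:00
#SBATCH --nodes=16
#SBATCH --ntasks-per-core=2
#SBATCH --ntasks-per-node=2
#SBATCH --cpus-per-task=12
#SBATCH --constraint=gpu
#========================================
# load modules and run simulation
module load daint-gpu
module load NAMD
export CRAY_CUDA_MPS=1                         # let both tasks on a node access the GPU
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
srun namd2 +idlepoll +ppn $((SLURM_CPUS_PER_TASK-1)) input.namd > namd.out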

Scaling

We provide a NAMD scaling example simulating the dynamics of the tobacco mosaic virus (STMV).

We run the scaling jobs with the constraint gpu on the Cray XC50, using 1 MPI task per node and 24 threads per task. The performance metric is the average of the days/ns values reported in the output file of each simulation.
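
For reference, a small sketch of how such an average could be extracted, assuming the output file namd.out from the job script above and the usual "Benchmark time ... days/ns" lines printed by NAMD:

# Average the days/ns values printed on NAMD's "Benchmark time" lines
grep "Benchmark time" namd.out | awk '
  { for (i = 1; i <= NF; i++) if ($i == "days/ns") { sum += $(i-1); n++ } }
  END { if (n) printf "average: %.3f days/ns over %d samples\n", sum / n, n }'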

Running this small example on 16 nodes, the parallel efficiency is around 50% relative to the 2-node baseline, which we take as the practical lower limit for this scaling indicator, while on 32 nodes the efficiency drops to ~30%.

The scaling data are reported in the table below:

Nodes    Days/ns    Speed-up
2        0.286      1.00
4        0.185      1.55
8        0.115      2.49
16       0.071      4.03
32       0.061      4.69
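
As a cross-check, the Speed-up column and the efficiency figures quoted above (~50% on 16 nodes, ~30% on 32) can be reproduced from the Days/ns values with 2 nodes as the baseline, e.g. with awk:

# Speed-up = days/ns(2 nodes) / days/ns(N); efficiency = speed-up / (N / 2)
awk 'BEGIN {
  n = split("2 4 8 16 32", nodes); split("0.286 0.185 0.115 0.071 0.061", dpn)
  for (i = 1; i <= n; i++) {
    s = dpn[1] / dpn[i]
    printf "%2d nodes: speed-up %.2f, efficiency %.0f%%\n", nodes[i], s, 100 * s / (nodes[i] / 2)
  }
}'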

Strong scaling results are plotted against ideal scaling as follows:

Further Documentation

NAMD Homepage

NAMD User's Guide