Julia is a programming language that was designed to solve the "two-language problem": prototypes written in an interactive high-level language like MATLAB, R or Python often need to be partly or fully rewritten in a lower-level language like C, C++ or Fortran once high-performance production code is required. Julia, which has its origins at MIT, can reach the performance of C, C++ or Fortran despite being high-level and interactive. This is possible thanks to Julia's just-ahead-of-time compilation: code can be executed in an interactive shell as is usual for prototyping languages, but functions and code blocks are compiled to machine code right before their first execution instead of being interpreted (note that modules are pre-compiled).
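This behavior can be observed directly in an interactive Julia session: timing the same function call twice shows the one-time compilation cost (a minimal sketch; exact timings will vary):

```julia
f(x) = 2 .* x .+ 1       # a simple element-wise function

@time f(rand(10^6));     # first call: includes just-ahead-of-time compilation
@time f(rand(10^6));     # second call: runs the already compiled machine code
```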
Julia is well suited for parallel computing, supporting, e.g., MPI and threads similar to OpenMP. Moreover, Julia's CUDA package enables writing native Julia code for GPUs [1,2], which can reach similar efficiency as CUDA C/C++ [3,4]. Julia was shown to be suitable for scientific GPU supercomputing at large scale, enabling nearly ideal scaling on thousands of GPUs on Piz Daint [3,4,5]. Furthermore, Julia permits direct calling of C/C++ and Fortran libraries without glue code. It also features interfaces to other prototyping languages such as Python, R and MATLAB. Finally, the Julia PackageCompiler enables compiling Julia modules into shared libraries that are callable from C or other languages (cf. this Proof of Concept).
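As an illustration of the C interoperability, here is a minimal sketch calling a C standard library function directly with `ccall`, with no wrapper or glue code involved (this mirrors the example from the Julia manual):

```julia
# Call the C library function clock() directly; Julia looks the symbol up
# in the running process, so no binding code needs to be written.
t = ccall(:clock, Int32, ())
println("CPU clock ticks since process start: ", t)
```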
Licensing Terms and Conditions
Julia is distributed under the MIT license. Julia is free for everyone to use and all source code is publicly available on GitHub.
Setup
There are two Julia modules available on Piz Daint:
- Julia
- JuliaExtensions
The module `Julia` contains the Julia language and MPI if loaded with `daint-mc`; it also includes the CUDA packages if loaded with `daint-gpu` (information on the usage of `daint-gpu` and `daint-mc` can be found here). The module `daint-gpu` is used by default in the examples below, whereas the equivalent command with `daint-mc` is provided as a comment after the symbol `#`. You can load the `Julia` module by typing:
```bash
module load daint-gpu # module load daint-mc
module load Julia
```
The module `JuliaExtensions` contains some additional useful Julia packages, for instance `Plots`, `HDF5` and `PyCall`. You can load the `JuliaExtensions` module by typing:
```bash
module load daint-gpu # module load daint-mc
module load JuliaExtensions
```
Note that this will also automatically load the `Julia` module it depends on. To see which packages `JuliaExtensions` contains, type:
```bash
module whatis JuliaExtensions
```
Refer to the Julia documentation for installing additional packages.
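For example, additional packages can be installed from the Julia shell with the built-in package manager (a minimal sketch; `Example` is just a placeholder package name, and packages are installed into your user depot):

```julia
using Pkg

Pkg.add("Example")   # install the (placeholder) package "Example"
Pkg.status()         # list the packages available in the active environment
```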
How to run on Piz Daint
Running an interactive Julia session on a compute node is straightforward. Just start an interactive bash session and then execute `julia`, i.e., type in your shell:
```bash
srun -C gpu -n 1 -A <project> --time=02:00:00 --pty bash # srun -C mc -n 1 -A <project> --time=02:00:00 --pty bash
julia
```
Please replace the string `<project>` with the ID of the active project that will be charged for the allocation. You will then be able to interactively execute computations on the allocated GPU/CPUs (set the allocation time according to your needs).
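Once the interactive session is running on a GPU node, a quick sanity check could look as follows (a minimal sketch assuming the CUDA packages included with the `Julia` module on `daint-gpu`):

```julia
using CUDA

A = CUDA.rand(1024, 1024)   # allocate a random matrix directly on the GPU
B = A * A                   # matrix multiplication executed on the GPU
println(sum(B))             # reduction on the GPU; prints a scalar result
```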
Production simulations can be scheduled as usual with SLURM. Here is an example showing how to run a 3-D heat diffusion solver which uses MPI for inter-GPU communication on 8 GPUs (the example is available here):
```bash
#!/bin/bash -l
#SBATCH --job-name="diffusion3D_multigpu_CuArrays"
#SBATCH --time=01:00:00
#SBATCH --nodes=8
#SBATCH --ntasks-per-node=1
#SBATCH --partition=normal
#SBATCH --constraint=gpu
#SBATCH --account <project>

module load daint-gpu
module load JuliaExtensions

srun julia -O3 --check-bounds=no diffusion3D_multigpu_CuArrays.jl
```
Note that the flag `-O3` activates level 3 optimization for Julia's just-ahead-of-time compilation and `--check-bounds=no` deactivates out-of-bounds checking of array indices to improve performance. You might also want to add `--math-mode=fast`, which enables some floating-point optimizations; however, be aware that these can affect results. For more information on available flags, type:
```bash
julia --help
```
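The solver above relies on MPI for inter-GPU communication. As a generic illustration of MPI usage in Julia (a minimal "hello world" sketch assuming the MPI package shipped with the `Julia` module, not the solver itself):

```julia
# hello_mpi.jl -- launch with, e.g., `srun julia hello_mpi.jl`
using MPI

MPI.Init()
comm   = MPI.COMM_WORLD
rank   = MPI.Comm_rank(comm)   # rank of this process
nprocs = MPI.Comm_size(comm)   # total number of processes
println("Hello from rank $rank of $nprocs")
MPI.Finalize()
```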
If your code uses CPU multithreading, you can activate it by inserting the following lines into your job script:
```bash
#SBATCH --cpus-per-task=12 # --cpus-per-task=36
#SBATCH --hint=nomultithread

export JULIA_NUM_THREADS=$SLURM_CPUS_PER_TASK
```
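Within Julia, the threads configured via `JULIA_NUM_THREADS` can then be used, for instance, with the `Threads.@threads` macro (a minimal sketch):

```julia
using Base.Threads

n = 10^7
a = zeros(n)
@threads for i in 1:n          # iterations are split across the available threads
    a[i] = 2.0 * i
end
println("computed with ", nthreads(), " threads")
```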
Further Documentation
References
[1] Besard, T., Foket, C., & De Sutter, B. (2018). Effective Extensible Programming: Unleashing Julia on GPUs. IEEE Transactions on Parallel and Distributed Systems, 30(4), 827-841.
[2] Besard, T., Churavy, V., Edelman, A., & De Sutter, B. (2019). Rapid software prototyping for heterogeneous and distributed platforms. Advances in Engineering Software, 132, 29-46.
[3] Räss, L., Omlin, S., & Podladchikov, Y. Y. (2019). A Nonlinear Multi-Physics 3-D Solver: From CUDA C + MPI to Julia. PASC19 Conference, Zurich, Switzerland.
[4] Räss, L., Omlin, S., & Podladchikov, Y. Y. (2019). Porting a Massively Parallel Multi-GPU Application to Julia: a 3-D Nonlinear Multi-Physics Flow Solver. JuliaCon Conference, Baltimore, US.
[5] Omlin, S., Räss, L., Kwasniewski, G., Malvoisin, B., & Podladchikov, Y. Y. (2020). Solving Nonlinear Multi-Physics on GPU Supercomputers with Julia. JuliaCon Conference, virtual.