Reference Documentation
The following exercises will help you understand how to build your own software on the ATOS HPCF or ECS.
Before we start...
Ensure your environment is clean by running:
module reset
Create a directory for this tutorial and cd into it:
mkdir -p compiling_tutorial
cd compiling_tutorial
Building simple programs
With your favourite editor, create three hello world programs: one in C, one in C++ and one in Fortran.
Compile and run each one of them with the GNU compilers (`gcc`, `g++`, `gfortran`).
Now, use the generic environment variables for the different compilers (`$CC`, `$CXX`, `$FC`) and rerun; you should see no difference from the results above.
Managing Dependencies
We are now going to use a simple program that will display the versions of the different libraries linked to it. Using your favourite editor, create a file called `versions.c` with the following contents. Then try to naively compile this program with:
$CC -o versions versions.c
The compilation above fails because the compiler does not know where to find the different libraries. We need to add some additional flags so that it can find both the include headers and the actual libraries to link against.
Let's use the existing software installed on the system via modules, and benefit from the corresponding `*_DIR` environment variables they define to manually construct the include and library flags:
$CC -o versions versions.c \
    -I$HDF5_DIR/include -I$NETCDF4_DIR/include -I$ECCODES_DIR/include \
    -L$HDF5_DIR/lib -lhdf5 -L$NETCDF4_DIR/lib -lnetcdf -L$ECCODES_DIR/lib -leccodes
Load the appropriate modules so that the line above completes successfully and generates the `versions` executable.
Run `./versions`. You will get an error such as the one below:
./versions: error while loading shared libraries: libhdf5.so.200: cannot open shared object file: No such file or directory
While you passed the location of the libraries at compile time, the program cannot find them at runtime. Inspect the executable with `ldd` to see which libraries are missing. Can you make the program run successfully?
Can you rebuild the program so that it uses the "old" versions of all those libraries from modules? Ensure that the output of the program matches the versions loaded in modules. Then do the same with the latest versions.
To simplify the build process, let's create a simple Makefile for this program. With your favourite editor, create a file called `Makefile` in the same directory with the following contents.
Watch the indentation!
Make sure that the indentations at the beginning of the lines are tabs and not spaces!
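A sketch of what such a Makefile could look like (variable and target names are illustrative, assuming the `*_DIR` variables provided by the loaded modules and the `clean`, `test` and `ldd` targets used below):

```make
CFLAGS  = -I$(HDF5_DIR)/include -I$(NETCDF4_DIR)/include -I$(ECCODES_DIR)/include
LDFLAGS = -L$(HDF5_DIR)/lib -L$(NETCDF4_DIR)/lib -L$(ECCODES_DIR)/lib
LDLIBS  = -lhdf5 -lnetcdf -leccodes

versions: versions.c
	$(CC) -o $@ $< $(CFLAGS) $(LDFLAGS) $(LDLIBS)

test: versions
	./versions

ldd: versions
	ldd versions

clean:
	$(RM) versions

.PHONY: test ldd clean
```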
You can test that it works by running:
make clean test ldd
Using different toolchains: prgenv
So far we have used the default compiler toolchain to build this program. Because of the installation paths of the libraries, it is easy to see both the library versions used and the compiler flavour with `ldd`:
$ make ldd
libhdf5.so.200 => /usr/local/apps/hdf5/<HDF5 version>/GNU/8.5/lib/libhdf5.so.200 (0x000014f612b7d000)
libnetcdf.so.19 => /usr/local/apps/netcdf4/<NetCDF version>/GNU/8.5/lib/libnetcdf.so.19 (0x000014f611f2a000)
libeccodes.so => /usr/local/apps/ecmwf-toolbox/<ecCodes version>/GNU/8.5/lib/libeccodes.so (0x000014f611836000)
Rebuild the program with:
- The default GNU GCC compiler.
- The default Classic Intel compiler.
- The default LLVM-based Intel compiler.
- The default AMD AOCC compiler.
Use the following command to test and show what versions of the libraries are being used at any point:
make clean test
Rebuild the program with the "new" GNU GCC compiler. Use the same command as above to test and show what versions of the libraries are being used at any point.
Rebuild the program with the Classic Intel compiler once again, but this time reset your module environment once the executable has been produced and before running it. What happens when you run it?
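The rebuild cycle follows the same pattern for every toolchain; for example (the `prgenv` module names shown here are assumptions, so check `module avail prgenv` for the exact names on your system):

```
$ module load prgenv/gnu
$ make clean test
$ module load prgenv/intel
$ make clean test
```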
Bringing MPI into the mix
Beyond the different compiler flavours on offer, we can also choose among several MPI implementations for our MPI parallel programs. On the Atos HPCF and ECS, we can choose from the following implementations:
Implementation | Module | Description |
---|---|---|
OpenMPI | openmpi | Standard OpenMPI implementation provided by Atos |
Intel MPI | intel-mpi | Intel's MPI implementation, based on MPICH. Part of the Intel oneAPI distribution |
HPC-X OpenMPI | hpcx-openmpi | NVIDIA's optimised flavour of OpenMPI. This is the recommended option |
For the next exercise, we will use this adapted hello world code for MPI.
Reset your environment with:
module reset
With your favourite editor, create the file `mpiversions.c` with the code above, and compile it into the executable `mpiversions`. Hint: you may use the module `hpcx-openmpi`.
Write a small batch job that will compile and run the program using 2 processors, and submit it to the batch system:
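Such a job could be sketched as follows (a sketch only: the SLURM directives, compiler wrapper and launcher shown are assumptions to adapt to your system):

```
#!/bin/bash
#SBATCH --job-name=mpiversions
#SBATCH --ntasks=2
#SBATCH --time=00:10:00
#SBATCH --output=mpiversions.%j.out

module reset
module load hpcx-openmpi

mpicc -o mpiversions mpiversions.c
srun -n 2 ./mpiversions
```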
Tweak the previous job to build and run the `mpiversions` program with as many combinations of compiler families and MPI implementations as you can.
Real-world example: CDO
To put into practice what we have learned so far, let's try to build and install CDO. You would typically not need to build this particular application, since it is already available as part of the standard software stack via modules, or easily installable with conda. However, it is a good illustration of how to build real-world software that depends on other software packages and libraries.
The goal of this exercise is for you to be able to build CDO and install it under one of your storage spaces (HOME or PERM), and then successfully run:
<PREFIX>/bin/cdo -V
You will need to:
- Familiarise yourself with the installation instructions of this package in the official documentation.
- Decide your installation path and your build path.
- Download the source code from the CDO website.
- Set up your build environment (i.e. modules, environment variables) for a successful build.
- Build and install the software.
- Test that it works with the command above.
Make sure that CDO is built with support for at least:
- NetCDF
- HDF5
- SZLIB (hint: use AEC)
- ecCodes
- PROJ
- CMOR
- UDUNITS
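As a hint, the configure step might look something like the session below (illustrative only: the option names should be checked against `./configure --help` for your CDO version, and the `*_DIR` variables for AEC, PROJ, CMOR and UDUNITS are assumed to be provided by the corresponding modules):

```
$ ./configure --prefix=<PREFIX> \
    --with-netcdf=$NETCDF4_DIR \
    --with-hdf5=$HDF5_DIR \
    --with-szlib=$AEC_DIR \
    --with-eccodes=$ECCODES_DIR \
    --with-proj=$PROJ_DIR \
    --with-cmor=$CMOR_DIR \
    --with-udunits2=$UDUNITS_DIR
$ make -j 8
$ make install
```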
Building in Batch
It is strongly recommended that you bundle your whole build process in a job script that you can submit in batch. That way you can request additional CPUs and speed up your compilation by exploiting build parallelism with make -j.
If you would like a starting point for such a job, you can start from the following example, adding and amending the necessary bits as needed:
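A possible skeleton for such a build job (a sketch; the SLURM directives, module names and paths shown are assumptions to adjust to your case):

```
#!/bin/bash
#SBATCH --job-name=build-cdo
#SBATCH --cpus-per-task=8
#SBATCH --time=01:00:00
#SBATCH --output=build-cdo.%j.out

set -e

module reset
# Load the modules your build needs here, e.g. the ones used earlier in this tutorial

cd <your build directory>
./configure --prefix=<PREFIX>     # plus the relevant --with-... options
make -j $SLURM_CPUS_PER_TASK
make install
```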