Compile Application
Two images are provided for development purposes.
They can be used as development containers. VS Code and CLion can be configured to use these containers for development. There is no need to install any dependencies on the host machine.
The `tlcfem/suanpan-env` image also supports `arm64`, in which case `OpenBLAS` is used as the linear algebra driver.
On AMD platforms, `MKL` is known to throttle and thus yield poor performance. It may be necessary to use a specific version of `OpenBLAS` or an alternative library instead.
It is possible to compile the project with Docker.
Check the provided Dockerfiles for more information.
One can build the image using the example `Dockerfile` as it is.
Once the image is built, run the container and use `sp`, `suanpan` or `suanPan` to invoke the program in the container.
It may be necessary to map some folders into the container.
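As a rough sketch of the build-and-run steps above, the following script prints the two Docker commands so that they can be reviewed before being executed. The image tag `suanpan-local`, the `Dockerfile` file name in the current folder and the `/work` mount point are assumptions made for illustration, not names taken from the project.

```shell
#!/bin/sh
# Hypothetical names: adjust the tag, the Dockerfile and the mount point.
IMAGE_TAG="suanpan-local"
MODEL_DIR="${MODEL_DIR:-$PWD}"

# Print the commands instead of running them so they can be reviewed first.
docker_commands() {
    # build the image from a Dockerfile located in the current folder
    printf 'docker build -t %s -f Dockerfile .\n' "$IMAGE_TAG"
    # start an interactive container with the model folder mapped to /work
    printf 'docker run --rm -it -v %s:/work -w /work %s\n' "$MODEL_DIR" "$IMAGE_TAG"
}

docker_commands
```

Piping the output to `sh` runs the commands. Inside the container, `sp`, `suanpan` or `suanPan` can then be invoked on the files found under the mapped folder.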
The Dockerfiles provide a standard, reproducible environment and a reference configuration. One can always adapt them to cater for various needs.
The following is a general guide that covers the three main operating systems.
It mainly targets the `amd64` architecture.
Installing Intel MKL is strongly recommended for potentially better performance.
A number of new features from new standards are utilized. To compile the binary, a compiler that supports C++20 is required.
GCC 11, Clang 13, MSVC 14.3, Intel compilers and later versions of those compilers have been tested with the source code.
On other platforms (Linux and macOS), simply use GCC, which comes with a valid Fortran compiler. Clang can also be used for C/C++ code, but since Clang and GCC differ in their support for new C++ standards, successful compilation is not guaranteed with Clang.
This is highly tailored to my own machine. Thus, it is not recommended to use it directly. Instead, use VS Code with the CMake extension to automatically configure the project.
A solution file is provided under the `MSVC/suanPan` folder. There are two configurations:
`Debug`: Assumes no Fortran compiler is available; all Fortran related libraries are provided as precompiled DLLs. Uses OpenBLAS for linear algebra. Multithreading disabled. Visualisation disabled. HDF5 support disabled.
`Release`: Fortran libraries are configured with Intel compilers. Uses MKL for linear algebra. Multithreading enabled. Visualisation enabled with VTK version 9.4. HDF5 support enabled. CUDA enabled.
If VTK, Intel oneAPI Toolkit and CUDA are not installed, only the `Debug` configuration can be successfully compiled.
Simply open the solution, switch to the `Debug` configuration, ignore any warnings and build the solution.
To compile the `Release` version:
Make sure both the oneAPI Base and HPC toolkits, as well as the Visual Studio integration, are installed. MKL is enabled via the integrated option `<UseInteloneMKL>Parallel</UseInteloneMKL>`.
Make sure CUDA is installed. The environment variable `$(CUDA_PATH)` is used to locate headers.
Make sure VTK is available. Then define a system environment variable `$(VTK_DIR)`, which points to the root folder of the VTK library.
Make sure MAGMA is available. Then define a system environment variable `$(MAGMA_DIR)`, which points to the root folder of the MAGMA library.
You probably need to compile MAGMA yourself. You can manually remove all MAGMA-related settings in the solution file if you don't want to use it.
Alternatively, `CMake` can be used to generate solution files if some external packages are not available.
Open the source code folder with VS Code. Whether you choose GCC or MSVC, the configuration is done by CMake automatically.
Install necessary tools.
Clone the project.
Invoke `make`.
Install VTK
The official Ubuntu repository does not contain the latest VTK library (Fedora's does), so it is better to compile it manually.
Install OpenGL first, as well as compilers if necessary.
Obtain VTK source code and unpack.
Create folder for building VTK.
Configure and compile VTK library. If necessary, installation destination can be modified. Here static libraries are built.
Now obtain the `suanPan` source code and unpack it. To configure it with VTK support, users may use the flag `-DSP_ENABLE_VTK=ON`. If `FindVTK` is present and `VTK` is installed to the default location, there is no need to provide the variable `VTK_DIR`; otherwise point it to the `lib/cmake/vtk-9.1` folder.
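A minimal sketch of the VTK-enabled configuration described above. It only composes and prints the configure command. The use of `-DCMAKE_BUILD_TYPE=Release`, the trailing `..` (a build folder inside the source tree) and the example installation prefix `/usr/local` are assumptions.

```shell
#!/bin/sh
# Compose the CMake configure command for a VTK-enabled build.
# $1: optional path to the VTK CMake folder (<prefix>/lib/cmake/vtk-9.1);
#     leave it empty when VTK is installed to the default location.
vtk_configure_cmd() {
    cmd="cmake -DCMAKE_BUILD_TYPE=Release -DSP_ENABLE_VTK=ON"
    if [ -n "$1" ]; then
        # VTK was installed elsewhere, so tell CMake where to look
        cmd="$cmd -DVTK_DIR=$1"
    fi
    printf '%s ..\n' "$cmd"
}

vtk_configure_cmd
vtk_configure_cmd /usr/local/lib/cmake/vtk-9.1
```

The first call relies on `FindVTK` to locate the library, while the second passes the location explicitly.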
Install MKL
The provided CMake configuration covers both `oneMKL` and `Intel MKL 2020`. Please note MKL is included in the `oneAPI` toolkit starting from 2021, which has a different folder structure compared to Intel Parallel Studio.
Add the repository.
Install the package.
Now compile `suanPan` by enabling MKL via the option `-DSP_ENABLE_MKL=ON`. The corresponding `MKLROOT` shall be assigned, for example `-DMKLROOT=/opt/intel/oneapi/mkl/latest/`, depending on the installation location.
VTK
Fedora offers the latest VTK library; simply install it.
MKL
First, create the `repo` file.
Move it to the proper location.
Install MKL. You may perform a search with `sudo dnf search intel-oneapi-mkl-devel` to find which package names are available and install a specific version if necessary.
The source can be compiled with VTK and MKL enabled.
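As an illustration only, a configuration with both VTK and MKL enabled might be composed as follows. The default `MKLROOT` is the oneAPI location quoted earlier in this guide. `-DCMAKE_BUILD_TYPE=Release` and the trailing `..` are assumptions, and the command is printed rather than executed.

```shell
#!/bin/sh
# Default oneAPI location quoted in this guide; override via the environment.
MKLROOT="${MKLROOT:-/opt/intel/oneapi/mkl/latest}"

# Print the configure command with VTK and MKL switched on.
full_configure_cmd() {
    printf 'cmake -DCMAKE_BUILD_TYPE=Release -DSP_ENABLE_VTK=ON -DSP_ENABLE_MKL=ON -DMKLROOT=%s ..\n' "$MKLROOT"
}

# Warn early if the MKL folder is missing, since CMake would fail later anyway.
if [ ! -d "$MKLROOT" ]; then
    echo "warning: $MKLROOT does not exist, adjust MKLROOT" >&2
fi

full_configure_cmd
```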
The following guide is based on macOS Big Sur (11).
Install tools. `gfortran`, `llvm` and `libomp` are used for compiling the main program; `glfw` and `glew` are required for compiling `VTK`. `VTK` does not compile with `GCC`. Here, we use `Clang`.
Similar to Ubuntu, compile `VTK` if wanted.
Obtain the source code and configure.
Any standard BLAS and LAPACK implementation can be used as the linear algebra driver on which the application itself and `Armadillo` rely.
Currently, the following are supported and tested.
As a general guideline, the following points should be considered when choosing an implementation.
`AMD Optimizing CPU Libraries (AOCL)` is optimised for AMD CPUs and is based on the `BLIS` and `FLAME` libraries; its performance on both platforms (and others) is superb.
Thus, use `Intel oneAPI MKL` if it is preferred or an Intel platform is targeted.
The downside is that `Intel oneAPI MKL` is proprietary and the final binary may be large.
For other cases, use `AMD Optimizing CPU Libraries (AOCL)` when possible.
The downside is that it may need manual compilation of the libraries for the target OS.
`OpenBLAS` shall be deemed the last resort and its usage is discouraged as of writing.
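The guideline above can be condensed into a small decision helper. This is only a sketch: the function name `recommend_driver` and its two arguments are hypothetical, and reading `vendor_id` from `/proc/cpuinfo` works on Linux x86 machines only.

```shell
#!/bin/sh
# $1: CPU vendor string, e.g. GenuineIntel or AuthenticAMD.
# $2: "yes" if AOCL libraries are available for the target OS.
recommend_driver() {
    if [ "$1" = "GenuineIntel" ]; then
        echo "MKL"          # an Intel platform is targeted
    elif [ "$2" = "yes" ]; then
        echo "AOCL"         # preferred for all other cases
    else
        echo "OpenBLAS"     # last resort
    fi
}

# On Linux x86 the vendor string can be read from /proc/cpuinfo, if present.
vendor=""
if [ -r /proc/cpuinfo ]; then
    vendor=$(awk -F': ' '/^vendor_id/ {print $2; exit}' /proc/cpuinfo)
fi

recommend_driver "${vendor:-unknown}" "no"
```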
If CMake GUI is used to configure the project, the following options are available.
`BUILD_SHARED_LIBS`: If enabled, all libraries will be built as shared libraries.
`SP_BUILD_DLL_EXAMPLE`: If enabled, example element/material/section implemented as external libraries will be built. This is not required for the successful build of the main application.
`SP_BUILD_PARALLEL`: If enabled, `TBB` will be used for multithreading so that element update, global matrix assembly, etc., can be parallelized. `OpenMP` is not controlled by this option given that `OpenMP` support is available on major platforms. It will be used for low level parallelization such as linear algebra operations (which is controlled by `Armadillo`) and matrix solving (which is controlled by various solvers). Thus, this flag only controls the `suanPan` application itself.
`SP_ENABLE_VTK`: If enabled, `VTK` will be used to provide support for visualization. It is useful for generating `.vtk` files that can be used in `Paraview` for post-processing. If enabled, `VTK_DIR` needs to be set to the path of the `VTK` installation. For example, `VTK_DIR=/usr/local/opt/vtk/lib/cmake/vtk-9.1`.
`SP_ENABLE_CUDA`: `CUDA` needs to be installed manually by the user. If enabled, `CUDA` based solvers will be available. However, for dense matrix storage, only the full matrix storage scheme is supported by `CUDA`. Note that the full matrix storage scheme is not favourable for FEM. It can, however, be used for sparse matrix solving and mixed precision solving.
`SP_ENABLE_MAGMA`: `MAGMA` needs to be installed manually by the user. If enabled, `MAGMA` based solvers will be available. The variable `MAGMAROOT` needs to be set to find necessary files.
`SP_ENABLE_ASAN`: If enabled, address sanitizer will be enabled. This is useful for debugging purposes.
`SP_ENABLE_CODECOV`: If enabled, compile options will be enabled to support code coverage report.
`SP_ENABLE_AVX`: If enabled, compiler flags `-mavx` or `/arch:AVX` will be used. (~2011)
`SP_ENABLE_AVX2`: If enabled, compiler flags `-mavx2` or `/arch:AVX2` will be used. (~2013)
`SP_ENABLE_AVX512`: If enabled, compiler flags `-mavx512f` or `/arch:AVX512` will be used. (~2016)
`SP_ENABLE_TBB_ALLOC`: If enabled, TBB's memory allocator will be used.
`SP_OPENBLAS_PATH`: If assigned, link the designated `OpenBLAS` library, otherwise the bundled version will be used.
`SP_ENABLE_AOCL`: If enabled, one can use the `AOCL` implementation by assigning the library paths `AOCL_BLIS_PATH`, `AOCL_FLAME_PATH` and `AOCL_UTILS_PATH`.
`SP_ENABLE_MKL`: `MKL` needs to be installed manually by the user. If enabled, `MKL` will be used for linear algebra operations. If `SP_ENABLE_MKL` is enabled, set `MKLROOT` to the root directory of the `MKL` installation. For example, `C:/Program Files (x86)/Intel/oneAPI/mkl/latest` or `/opt/intel/oneapi/mkl/latest`. The following additional options are also available.
`SP_ENABLE_SHARED_MKL`: If enabled, dynamically linked `MKL` libraries will be used. Otherwise, statically linked `MKL` libraries will be used, leading to larger binary size but faster execution and fewer dependencies.
`SP_ENABLE_IOMP`: If enabled, Intel's OpenMP implementation will be used. Otherwise, the default implementation (such as the GNU OpenMP library) will be used.
`SP_ENABLE_MPI`: Enables cluster support via MPI.
`SP_ENABLE_64BIT_INDEXING`: Enables 64-bit integers for matrix indexing.
!!! warning
The `SP_ENABLE_IOMP` option should be switched on/off based on the compilers used.
In principle, there should be only one implementation of OpenMP.
Thus, if the compilers used are `gcc/g++/gfortran`, set `SP_ENABLE_IOMP=OFF`.
If the compilers are `icx/icpx/ifx`, set `SP_ENABLE_IOMP=ON`.
The `Spike` solver also does not support mixing OpenMP implementations.
Use with caution if two OpenMP implementations exist.
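The rule in the warning above can be written down as a small helper that maps a compiler name to the matching flag. The function name `iomp_flag` is hypothetical, and compilers other than the two families named above are deliberately rejected rather than guessed.

```shell
#!/bin/sh
# Map a compiler name (or full path) to the matching SP_ENABLE_IOMP setting.
iomp_flag() {
    case "$(basename "$1")" in
        icx|icpx|ifx)
            # Intel compilers: use Intel's OpenMP implementation
            echo "-DSP_ENABLE_IOMP=ON" ;;
        gcc*|g++*|gfortran*)
            # GNU compilers: stay with the GNU OpenMP library
            echo "-DSP_ENABLE_IOMP=OFF" ;;
        *)
            echo "unknown compiler: $1" >&2
            return 1 ;;
    esac
}

iomp_flag g++
iomp_flag icpx
```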
`aarch64` Architecture
The `aarch64` architecture is supported by the source code.
But one shall prepare the dependencies manually.
As MKL is not available for `aarch64`, one shall use `OpenBLAS` or `AOCL` only.
`OpenBLAS` is the only necessary dependency.
All other dependencies are optional.
To use `AOCL`, one shall manually compile the `blis` and `flame` libraries in advance.
To enable 64-bit indexing, compile the application with the flag `-DSP_ENABLE_64BIT_INDEXING=ON`.
!!! note
The bundled `OpenBLAS` binaries are built for `lp64` only.
If one decides to use `OpenBLAS`, an `ilp64` version must be provided via `SP_OPENBLAS_PATH`.
`MKL` provides both versions for `lp64` and `ilp64`, and `CMake` will handle the linkage automatically.
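A sketch of how the note above translates into configure flags. The helper name `ilp64_flags` and the library path in the second call are hypothetical, and only the options listed in this guide are used.

```shell
#!/bin/sh
# $1: driver name (mkl or openblas); $2: path to an ilp64 OpenBLAS library.
ilp64_flags() {
    flags="-DSP_ENABLE_64BIT_INDEXING=ON"
    case "$1" in
        mkl)
            # MKL ships both lp64 and ilp64; CMake handles the linkage
            flags="$flags -DSP_ENABLE_MKL=ON" ;;
        openblas)
            # the bundled OpenBLAS is lp64 only, so a custom build is mandatory
            if [ -z "$2" ]; then
                echo "an ilp64 OpenBLAS path is required" >&2
                return 1
            fi
            flags="$flags -DSP_OPENBLAS_PATH=$2" ;;
        *)
            echo "unsupported driver: $1" >&2
            return 1 ;;
    esac
    echo "$flags"
}

ilp64_flags mkl
ilp64_flags openblas /opt/openblas-ilp64/lib/libopenblas.so
```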
To configure the source code, CMake shall be available. Please install it before configuring the source code package.
The linear algebra driver used is OpenBLAS. You may want to compile it with the optimal configuration based on the specific machine. Otherwise, precompiled binaries (dynamic platform) are available.
Please be aware that MKL is throttled on AMD platforms, as performance comparisons show. If you have AMD CPUs, please investigate further to determine which linear algebra library is more suitable.
On Windows, Visual Studio with the Intel oneAPI toolkit is recommended. Alternatively, a GCC distribution for Windows can be used if GCC compilers are preferred.
Download the latest source code archive from GitHub.
The manual compilation is not difficult in general. The CI/CD configuration files can be referred to if you wish. Here some general guidelines are given.
This contains some precompiled libraries used.
For versions other than 9.4, the names of the linked libraries shall be manually changed as they contain version numbers. Thus, it is not a good idea to switch to a different version. A precompiled VTK library is also available.
The following instructions are based on Ubuntu 22.04. CMake is used to manage builds. It is recommended to use CMake GUI if appropriate.
Create a build folder and configure via CMake. The default configuration disables parallelism (`-DSP_BUILD_PARALLEL=OFF`) and enables HDF5 via the bundled library (`-DSP_ENABLE_HDF5=ON`). Please check the CMake configuration files or use the GUI for available options.
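A minimal sketch of the configure-and-build step with the default options spelled out explicitly. It assumes the script is started from the root of the unpacked source tree; the folder name `build` and `-DCMAKE_BUILD_TYPE=Release` are also assumptions. When `cmake` or `CMakeLists.txt` is missing, the script only prints what it would run.

```shell
#!/bin/sh
# Assumed layout: the script is started from the root of the source tree.
BUILD_DIR="${BUILD_DIR:-build}"
CONFIG_FLAGS="-DCMAKE_BUILD_TYPE=Release -DSP_BUILD_PARALLEL=OFF -DSP_ENABLE_HDF5=ON"

# Run in a subshell so that the caller's working directory is left untouched.
configure_and_build() (
    mkdir -p "$BUILD_DIR" && cd "$BUILD_DIR" || exit 1
    # CONFIG_FLAGS is intentionally unquoted so it splits into separate options
    cmake $CONFIG_FLAGS .. && cmake --build .
)

if command -v cmake >/dev/null 2>&1 && [ -f CMakeLists.txt ]; then
    configure_and_build
else
    echo "would run in $BUILD_DIR: cmake $CONFIG_FLAGS .. && cmake --build ."
fi
```

Switching `-DSP_BUILD_PARALLEL` to `ON` in `CONFIG_FLAGS` enables the TBB based multithreading described in the option list.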
The following guide describes a manual installation from the Ubuntu terminal using the official repository.
Intel also provides a repository to install MKL via `dnf`.
`MKL` can be installed if necessary.
From my very personal experience, `OpenBLAS` is okay for small matrices on a few cores, but could be slow when the size hits some threshold.
This is also seen in performance comparisons.
`Intel oneAPI MKL` is exclusively optimised for Intel CPUs, and may throttle on other platforms.
Performance comparisons show clear differences on AMD platforms.
`SP_ENABLE_HDF5`: If enabled, `HDF5` support will be available.
The following command is used to compile the program to be distributed via snap.
The FEM framework part itself uses `unsigned` 32-bit integers for indexing, which means it supports up to 4 billion objects, for example, nodes and elements.
This is sufficient at least for the years to come.
But the linear algebra driver may quickly hit the size limit.
For `lp64` builds, the indexing capability of a default Fortran integer is around 2 billion.
If the problem is large enough, the global matrix may have more than 2 billion entries (either dense or sparse storage).
In this case, it is necessary to use 64-bit indexing, see the `ilp64` model.
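To make the limit concrete, the following sketch compares the number of stored matrix entries with the largest default (32-bit signed) Fortran integer, 2147483647, and prints the suggested setting. The helper name `indexing_flag` and the example size are hypothetical.

```shell
#!/bin/sh
# Largest value of a default (32-bit signed) Fortran integer.
INT32_MAX=2147483647

# $1: number of stored matrix entries; prints the suggested CMake flag.
indexing_flag() {
    if [ "$1" -gt "$INT32_MAX" ]; then
        echo "-DSP_ENABLE_64BIT_INDEXING=ON"
    else
        echo "-DSP_ENABLE_64BIT_INDEXING=OFF"
    fi
}

# Dense storage of an n-by-n matrix needs n*n entries.
n=50000
indexing_flag $((n * n))
```

With 50000 degrees of freedom and dense storage, the matrix holds 2.5 billion entries, which already exceeds the 32-bit range, so the script suggests switching 64-bit indexing on.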
!!! note
If `SP_ENABLE_MPI=ON` is also set, one shall make sure the linked MPI is also compiled with 64-bit integer support. At least one MPI implementation is known to work well with either 32-bit or 64-bit integer size.