
Compile Application



With Docker

Dev Environment Images

Two images are provided for development purposes.

# vtk+mkl+cuda
docker pull tlcfem/suanpan-env-cuda
# vtk+mkl
docker pull tlcfem/suanpan-env

They can be used as development containers. VS Code and CLion can be configured to use these containers for development. There is no need to install any dependencies on the host machine.

The tlcfem/suanpan-env image also supports arm64, in which OpenBLAS is used as the linear algebra driver.

On AMD platforms, MKL is known to throttle and thus yield poor performance. It may be necessary to use a specific version of OpenBLAS or AMD Optimizing CPU Libraries (AOCL) instead.

Docker Images

It is possible to compile the project with Docker. Check the provided Dockerfiles for more information. One can build the image using an example Dockerfile as it is. For example,

# the current folder contains the file Rocky.Dockerfile
docker build -t suanpan -f ./Rocky.Dockerfile .

Once the image is built, run the container and use sp, suanpan or suanPan to invoke the program in the container. It may be necessary to map some host folders into the container.

docker run -it --rm suanpan
# now in the container
suanpan -v
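
Mapping a host folder can be sketched as follows. This is a minimal illustration, not part of the provided Dockerfiles. It assumes the image was tagged suanpan as above, that the image defines no custom entrypoint, that /work is an arbitrary mount point, and that the program accepts -f to name the input file. The helper name sp_docker and the DOCKER variable are hypothetical.

```shell
# Hypothetical helper: run the containerised program on a model stored on the host.
# DOCKER may be overridden, e.g. DOCKER=podman (an assumption, not a project convention).
sp_docker() {
    model_dir="$1"   # absolute host folder that holds the model
    model_file="$2"  # file name inside that folder
    # -v mounts the host folder at /work, -w makes it the working directory
    "${DOCKER:-docker}" run -it --rm \
        -v "${model_dir}:/work" -w /work \
        suanpan suanpan -f "${model_file}"
}
```

A call such as sp_docker "$PWD" example.supan would then run the model from the current folder, with any output written back to the host folder.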

The Dockerfiles provide a standard reproducible environment and a reference configuration. One can always adapt them to cater for various needs.

Without Docker

The following is a general guide that covers three main operating systems. It mainly targets the amd64 architecture.

Prerequisites

  1. It is strongly recommended to install Intel MKL for potentially better performance.

Toolsets

A number of new features from new standards are utilized. To compile the binary, a compiler that supports C++20 is required.

GCC 11, Clang 13, MSVC 14.3, Intel compilers and later versions of those compilers are tested with the source code.

On other platforms (Linux and macOS), simply use GCC, which comes with a valid Fortran compiler. Clang can also be used for C/CPP code, but since Clang and GCC differ in their support for new C++ standards, successful compilation is not guaranteed with Clang.
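
Before configuring, it can help to compare the compiler on the PATH with the minimum GCC version listed above. The following is a minimal sketch. The function name gcc_new_enough is hypothetical, and the check assumes that g++ is a genuine GCC (on macOS, g++ is often an alias for Apple Clang, for which this check is not meaningful).

```shell
# Hypothetical helper: succeed if a GCC version string has major version >= 11.
gcc_new_enough() {
    major="${1%%.*}"   # keep only the part before the first dot
    [ "${major}" -ge 11 ]
}

# usage: the check is skipped quietly when g++ is not installed
if command -v g++ >/dev/null 2>&1; then
    if gcc_new_enough "$(g++ -dumpversion)"; then
        echo "g++ is recent enough"
    else
        echo "g++ is older than GCC 11"
    fi
fi
```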

Obtain Source Code

Configure and Compile

Windows (Visual Studio)

The bundled Visual Studio solution is highly tailored to my own machine. Thus, it is not recommended to use it directly. Instead, use VS Code with the CMake extension to configure the project automatically.

The solution file is provided under the MSVC/suanPan folder. There are two configurations:

  1. Debug: Assume no available Fortran compiler, all Fortran related libraries are provided as precompiled DLLs. Use OpenBLAS for linear algebra. Multithreading disabled. Visualisation disabled. HDF5 support disabled.

  2. Release: Fortran libraries are configured with Intel compilers. Use MKL for linear algebra. Multithreading enabled. Visualisation enabled with VTK version 9.4. HDF5 support enabled. CUDA enabled.

If VTK, Intel oneAPI Toolkit and CUDA are not installed, only the Debug configuration can be successfully compiled. Simply open the solution and switch to Debug configuration, ignore all potential warnings and build the solution.

To compile the Release version:

  1. Make sure both the oneAPI Base and HPC toolkits, as well as the VS integration, are installed. MKL is enabled via the integrated option <UseInteloneMKL>Parallel</UseInteloneMKL>.

  2. Make sure CUDA is installed. The environment variable $(CUDA_PATH) is used to locate headers.

  3. Make sure VTK is available. Then define a system environment variable $(VTK_DIR), which points to the root folder of VTK library. On my machine, it is

    VTK_DIR=C:\Program Files\VTK\
  4. Make sure MAGMA is available. Then define a system environment variable $(MAGMA_DIR), which points to the root folder of MAGMA library. On my machine, it is

    MAGMA_DIR=C:\Program Files\MAGMA\

    You probably need to compile MAGMA yourself. You can manually remove all magma related settings in the solution file if you don't want to use it.

Alternatively, CMake can be used to generate solution files if some external packages are not available.

Windows (Visual Studio Code)

Open the source code folder with VS Code. Whether you choose GCC or MSVC, the configuration is done by CMake automatically.

Ubuntu

  1. Install necessary tools.

    sudo apt-get install gcc g++ gfortran git cmake libomp5 libglvnd-dev -y
  2. Clone the project.

    git clone -b master --depth 1 https://github.com/TLCFEM/suanPan.git
  3. Create a build folder and configure via CMake.

    cd suanPan && mkdir build && cd build
    cmake ../
  4. Invoke make.

    make -j"$(nproc)"


Install VTK

The official Ubuntu repository does not contain the latest VTK library (Fedora's does), so it is better to compile it manually.

  1. Install OpenGL first, as well as compilers if necessary.

    sudo apt install gcc-10 g++-10 gfortran-10 libglvnd-dev
  2. Obtain VTK source code and unpack.

    wget https://www.vtk.org/files/release/9.1/VTK-9.1.0.tar.gz
    tar -xf VTK-9.1.0.tar.gz
  3. Create folder for building VTK.

    mkdir VTK-build && cd VTK-build
  4. Configure and compile VTK library. If necessary, installation destination can be modified. Here static libraries are built.

    cmake -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=OFF -DCMAKE_INSTALL_PREFIX=../VTK-out ../VTK-9.1.0
    make install -j4
  5. Now obtain the suanPan source code and unpack it. To configure it with VTK support, use the flag -DSP_ENABLE_VTK=ON. If FindVTK is present and VTK is installed to the default location, there is no need to provide the variable VTK_DIR; otherwise, point it to the lib/cmake/vtk-9.1 folder.
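
The last step can be sketched as a small helper that assembles the two VTK-related flags. This is only an illustration: the function name vtk_flags is hypothetical, and the folder layout in the usage comment (suanPan, VTK-out and the build folder side by side) is an assumption.

```shell
# Hypothetical helper: print the VTK-related CMake flags for a given
# VTK install prefix ($1) and VTK major.minor version ($2).
vtk_flags() {
    printf '%s\n' "-DSP_ENABLE_VTK=ON -DVTK_DIR=$1/lib/cmake/vtk-$2"
}

# usage, run from a build folder next to the suanPan and VTK-out folders:
#   cmake $(vtk_flags "$(cd ../VTK-out && pwd)" 9.1) ../suanPan
```

The unquoted substitution in the usage line relies on word splitting, so it only works when the install prefix contains no spaces.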

Install MKL

The provided CMake configuration covers both oneMKL and Intel MKL 2020. Please note MKL is included in oneAPI toolkit starting from 2021, which has a different folder structure compared to Intel Parallel Studio.

  1. Add repository. To summarise,

    wget https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB
    sudo apt-key add GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB
    echo "deb https://apt.repos.intel.com/oneapi all main" | sudo tee /etc/apt/sources.list.d/oneAPI.list 
  2. Install the package.

    sudo apt update && sudo apt install intel-oneapi-mkl-devel -y
  3. Now compile suanPan by enabling MKL via option -DSP_ENABLE_MKL=ON. The corresponding MKLROOT shall be assigned, for example -DMKLROOT=/opt/intel/oneapi/mkl/latest/, depending on the installation location. The configuration used for snap is the following one.

    -DCMAKE_BUILD_TYPE=Release
    -DCMAKE_INSTALL_PREFIX=
    -DMKLROOT=/opt/intel/oneapi/mkl/latest
    -DSP_BUILD_PARALLEL=ON
    -DSP_ENABLE_HDF5=ON
    -DSP_ENABLE_IOMP=OFF
    -DSP_ENABLE_MKL=ON
    -DSP_ENABLE_SHARED_MKL=OFF
    -DSP_ENABLE_VTK=ON
    -DVTK_DIR=$CRAFT_PART_BUILD/lib/cmake/vtk-9.4/

Fedora

VTK

Fedora offers the latest VTK library; simply install it.

sudo dnf install vtk-devel

MKL

First, create the repo file.

tee > /tmp/oneAPI.repo << EOF
[oneAPI]
name=Intel® oneAPI repository
baseurl=https://yum.repos.intel.com/oneapi
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://yum.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB
EOF

Move it to the proper location.

sudo mv /tmp/oneAPI.repo /etc/yum.repos.d

Install MKL. You may run sudo dnf search intel-oneapi-mkl-devel to find which package names are available and install a specific version if necessary.

sudo dnf install intel-oneapi-mkl-devel

The source can be compiled with VTK and MKL enabled.

macOS

The following guide is based on macOS Big Sur (11).

Install tools. gfortran, llvm and libomp are used for compiling the main program, while glfw and glew are required for compiling VTK. VTK does not compile with GCC, so Clang is used here.

# install brew if not installed
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
# install necessary packages
brew install gcc@10 llvm@13 libomp glfw glew git cmake

Similar to Ubuntu, compile VTK if wanted.

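# wget is not shipped with macOS, so use curl (-L follows redirects, -O keeps the file name)
curl -LO https://www.vtk.org/files/release/9.1/VTK-9.1.0.tar.gz
tar xf VTK-9.1.0.tar.gz && rm VTK-9.1.0.tar.gz
mkdir VTK-build && cd VTK-build
cmake -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=OFF -DCMAKE_INSTALL_PREFIX=../VTK-out ../VTK-9.1.0
make install -j4
# return to the parent folder, which now contains VTK-out
cd ..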

Obtain the source code and configure.

# clone source code
git clone -b master https://github.com/TLCFEM/suanPan.git
# create build directory next to the source folder
mkdir suanpan-build && cd suanpan-build
# use clang, clang++ and gfortran; llvm@13 is keg-only, so ask brew for its prefix
export CC="$(brew --prefix llvm@13)/bin/clang" && export CXX="$(brew --prefix llvm@13)/bin/clang++" && export FC=gfortran-10
# configure project; the last argument is the source folder, and VTK_DIR assumes VTK-out sits next to suanpan-build
cmake -DCMAKE_BUILD_TYPE=Release -DSP_BUILD_PARALLEL=ON -DSP_ENABLE_HDF5=ON -DSP_ENABLE_VTK=ON -DVTK_DIR=../VTK-out/lib/cmake/vtk-9.1/ ../suanPan
# compile
make -j4

Linear Algebra Driver

Any standard BLAS and LAPACK implementation can be used as the linear algebra driver, on which the application itself and Armadillo rely. Currently, the following are supported and tested: OpenBLAS, Intel oneAPI MKL and AMD Optimizing CPU Libraries (AOCL).

As a general guideline, the following points shall be considered when choosing an implementation.

  1. AMD Optimizing CPU Libraries (AOCL) is optimised for AMD CPUs and is based on the BLIS and FLAME libraries. Its performance on both Intel and AMD platforms (and others) is superb.

Thus, use Intel oneAPI MKL if it is preferred or an Intel platform is targeted. The downside is that Intel oneAPI MKL is proprietary and the final binary may have a large size. For other cases, use AMD Optimizing CPU Libraries (AOCL) when possible. The downside is that it may need manual compilation of the libraries for the target OS. OpenBLAS shall be deemed the last resort, and its usage is discouraged as of writing.

Build Options

If CMake GUI is used to configure the project, the following options are available.

  1. BUILD_SHARED_LIBS: If enabled, all libraries will be built as shared libraries.

  2. SP_BUILD_DLL_EXAMPLE: If enabled, example element/material/section implemented as external libraries will be built. This is not required for the successful build of the main application.

  3. SP_BUILD_PARALLEL: If enabled, TBB will be used for multithreading so that element update, global matrix assembly, etc., can be parallelized. OpenMP is not controlled by this option, given that OpenMP support is available on major platforms. OpenMP is used for low-level parallelization such as linear algebra operations (controlled by Armadillo) and matrix solving (controlled by the various solvers). Thus, this flag only controls the suanPan application itself.

  4. SP_ENABLE_VTK: If enabled, VTK will be used to provide support for visualization. It will be useful to generate .vtk files that can be used in Paraview for post-processing. If enabled, VTK_DIR needs to be set to the path of VTK installation. For example, VTK_DIR=/usr/local/opt/vtk/lib/cmake/vtk-9.1.

  5. SP_ENABLE_CUDA: CUDA needs to be installed manually by the user. If enabled, CUDA based solvers will be available. However, for dense matrix storage, only full matrix storage scheme is supported by CUDA. Note full matrix storage scheme is not favourable for FEM. It can, however, be used for sparse matrix solving and mixed precision solving.

  6. SP_ENABLE_MAGMA: MAGMA needs to be installed manually by the user. If enabled, MAGMA based solvers will be available. The variable MAGMAROOT needs to be set to find necessary files.

  7. SP_ENABLE_ASAN: If enabled, address sanitizer will be enabled. This is useful for debugging purposes.

  8. SP_ENABLE_CODECOV: If enabled, compile options will be enabled to support code coverage report.

  9. SP_ENABLE_AVX: If enabled, compiler flags -mavx or /arch:AVX will be used. (~2011)

  10. SP_ENABLE_AVX2: If enabled, compiler flags -mavx2 or /arch:AVX2 will be used. (~2013)

  11. SP_ENABLE_AVX512: If enabled, compiler flags -mavx512f or /arch:AVX512 will be used. (~2016)

  12. SP_ENABLE_TBB_ALLOC: If enabled, the TBB's memory allocator will be used.

  13. SP_OPENBLAS_PATH: If assigned, link the designated OpenBLAS library, otherwise the bundled version will be used.

  14. SP_ENABLE_AOCL: If enabled, one can use the AOCL implementation via assigning library path SP_AOCL_PATH.

  15. SP_ENABLE_MKL: MKL needs to be installed manually by the user. If enabled, MKL will be used for linear algebra operations. If SP_ENABLE_MKL is enabled, set MKLROOT to the root directory of MKL installation. For example, C:/Program Files (x86)/Intel/oneAPI/mkl/latest or /opt/intel/oneapi/mkl/latest, also the following additional options are available.

  16. SP_ENABLE_SHARED_MKL: If enabled, dynamically linked MKL libraries will be used. Otherwise, statically linked MKL libraries will be used, leading to larger binary size but faster execution and fewer dependencies.

  17. SP_ENABLE_IOMP: If enabled, Intel's OpenMP implementation will be used. Otherwise, the default one (such as the GNU OpenMP library) will be used.

  18. SP_ENABLE_MPI: Enable cluster support via MPI.

  19. SP_ENABLE_64BIT_INDEXING: Enable 64-bit integer for matrix indexing.

!!! warning The SP_ENABLE_IOMP should be switched on/off based on the compilers used. In principle, there should be only one implementation of OpenMP. Thus, if the compilers used are gcc/g++/gfortran, SP_ENABLE_IOMP=OFF. If the compilers are icx/icpx/ifx, SP_ENABLE_IOMP=ON. The Spike solver also does not support mixing OpenMP implementations. Use with caution if two OpenMP implementations exist.
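
The rule in the warning can be sketched as a small helper that derives the option value from the compiler name. This is an illustration only. The function name iomp_value is hypothetical, and defaulting to OFF for any compiler not named in the warning is an assumption.

```shell
# Hypothetical helper: pick the SP_ENABLE_IOMP value from a compiler name or path.
iomp_value() {
    case "$(basename "$1")" in
        icx|icpx|ifx) echo ON ;;   # Intel compilers: use Intel's OpenMP
        *) echo OFF ;;             # gcc/g++/gfortran; other compilers are an assumption
    esac
}

# usage:
#   cmake -DSP_ENABLE_IOMP="$(iomp_value "${CXX:-g++}")" ..
```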

Example Configuration

# assume current folder is suanPan/build
# the parent folder contains source code
cmake -DCMAKE_INSTALL_PREFIX= \
      -DCMAKE_BUILD_TYPE=Release \
      -DSP_BUILD_PARALLEL=ON \
      -DSP_ENABLE_HDF5=ON \
      -DSP_ENABLE_VTK=ON \
      -DVTK_DIR=$CRAFT_PART_BUILD/lib/cmake/vtk-9.4/ \
      -DSP_ENABLE_MKL=ON \
      -DMKLROOT=/opt/intel/oneapi/mkl/latest \
      -DSP_ENABLE_IOMP=OFF \
      -DSP_ENABLE_SHARED_MKL=OFF

aarch64 Architecture

The aarch64 architecture is supported by the source code. But one shall prepare the dependencies manually. As MKL is not available for aarch64, one shall use OpenBLAS or AOCL only.

OpenBLAS is the only necessary dependency. All other dependencies are optional.

To use AOCL, one shall manually compile the blis and flame libraries in advance.

64-bit Indexing

To enable 64-bit indexing, compile the application with the flag -DSP_ENABLE_64BIT_INDEXING=ON.

!!! note The bundled OpenBLAS binaries are built for lp64 only. If one decides to use OpenBLAS, an ilp64 version must be provided via SP_OPENBLAS_PATH.

MKL provides both lp64 and ilp64 versions; CMake will handle the linkage automatically.
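
As a rough illustration of when 64-bit indexing becomes relevant: an lp64 build indexes with 32-bit signed integers, whose maximum is 2^31 - 1 = 2147483647. The sketch below checks whether a dense square matrix of a given size has more entries than that. The function name needs_64bit_index is hypothetical, and the dense square layout is an assumption made for simplicity; sparse storage reaches the limit at different problem sizes.

```shell
# Hypothetical helper: succeed if a dense n-by-n matrix has more entries than
# a 32-bit signed integer can count (2^31 - 1 = 2147483647).
needs_64bit_index() {
    [ "$(( $1 * $1 ))" -gt 2147483647 ]
}

# usage:
#   if needs_64bit_index 50000; then echo "configure with -DSP_ENABLE_64BIT_INDEXING=ON"; fi
```

Under this assumption the threshold lies between 46340 (2147395600 entries) and 46341 (2147488281 entries).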

To configure the source code, CMake shall be available. Please install it before configuring the source code package.

The linear algebra driver used is OpenBLAS. You may want to compile it with the optimal configuration based on the specific machine. Otherwise, precompiled binaries (dynamic platform) are available in this repository.

Please be aware that MKL is throttled on AMD platforms. Performance comparisons can be seen, for example, here. If you have AMD CPUs, please investigate further to determine which linear algebra library is more suitable.

On Windows, Visual Studio with Intel oneAPI toolkit is recommended. Alternatively, WinLibs can be used if GCC compilers are preferred.

Download the source code archive from GitHub Releases or the latest stable code.

The manual compilation is not difficult in general. The CI/CD configuration files can be referred to if you wish. Please check this page. Here some general guidelines are given.

This repository contains some precompiled libraries used.

For VTK versions other than 9.4, names of the linked libraries shall be manually changed as they contain version numbers. Thus, it is not a good idea to switch to a different version. Precompiled VTK library is also available in this repository.

The Ubuntu instructions are based on Ubuntu 22.04. CMake is used to manage builds. It is recommended to use CMake GUI if appropriate.

Create a build folder and configure via CMake. The default configuration disables parallelism (-DSP_BUILD_PARALLEL=OFF) and enables HDF5 via the bundled library (-DSP_ENABLE_HDF5=ON). Please check the CMakeLists.txt file or use the GUI for available options.

The MKL guide for Ubuntu is a manual installation from the terminal using the official repository. See this page for details.

Intel also provides a repository to install MKL via dnf. See this page for details.

MKL can be installed if necessary. See this page.

From my very personal experience, OpenBLAS is okay for small matrices on a few cores, but could be slow when the size hits some threshold. This is also seen in this benchmark.

Intel oneAPI MKL is exclusively optimised for Intel CPUs, and may throttle on other platforms. This benchmark shows clear differences on AMD platforms.

SP_ENABLE_HDF5: If enabled, HDF5 will be used to provide support for hdf5recorder.

The example configuration is the command used to compile the program to be distributed via snap. See this file.

The FEM framework part itself uses unsigned 32-bit integers for indexing, which means it supports up to 4 billion objects, for example, nodes and elements. This is sufficient at least for the coming years. But the linear algebra driver may quickly hit the size limit. For lp64 builds, the indexing capability of a default Fortran integer is around 2 billion. If the problem is large enough, the global matrix may have more than 2 billion entries (either dense or sparse storage). In this case, it is necessary to use 64-bit indexing; see the ilp64 model.

!!! note If SP_ENABLE_MPI=ON is also enabled, one shall make sure the linked MPI is also compiled with 64-bit integer support. MPICH is known to work well with either 32-bit or 64-bit integer size.