suanPan-manual
  • Introduction
  • Basic
    • Obtain Application
    • Configure Application
    • Perform Analysis
    • Model Syntax
    • Model Structure
    • Tweak Performance
    • Compile Application
    • Build Documentation
    • Architecture Design
    • On Clusters
  • Example
    • Developer
      • element template
      • material template
    • Solid
      • wave propagation
    • Geotechnical
      • triaxial compression of sand
      • slope analysis
    • Structural
      • Statics
        • bending of a cantilever beam
        • bifurcation of a cantilever beam
        • double-edge notched specimen
        • lees frame
        • notched beam under cyclic loading
        • rc section analysis
        • truss roof
        • uniform tension of a rubber specimen
        • thin-walled section analysis for frame structures
        • calibration of subloading surface model
      • Dynamics
        • bouncing of a ball
        • mass-spring-dashpot system
        • dynamic analysis of a portal frame
        • elemental damping
        • particle collision
        • response history analysis of an elastic coupled wall
        • multi-support excitation
        • triple pendulum
        • computing response spectrum
        • integrate with python
        • process ground motion
      • Hybrid
        • vibration of a displaced beam
      • Buckling
        • buckling analysis of a cantilever beam
      • Contact
        • contact between beam and block
        • contact in 3d space
      • Optimization
        • evolutionary structural optimization
      • Isogeometric Analysis
        • linear analysis of a single element
    • Miscellaneous
      • batch execution for automation
  • Command Collection
    • Define
      • amplitude
      • bc
      • domain
      • element
      • expression
      • file
      • generate
      • group
      • import
      • initial
      • load
      • material
      • modifier
      • node
      • recorder
      • section
    • Configure
      • analyze
      • converger
      • criterion
      • integrator
      • precheck
      • step
    • Process
      • benchmark
      • clear
      • command
      • enable
      • exit
      • materialtest
      • materialtestbyload
      • sectiontest
      • peek
      • plot
      • protect
      • pwd
      • reset
      • save
      • set
      • upsampling
      • sdof_response
      • response_spectrum
  • Amplitude
    • Amplitude
    • Special
      • NZStrongMotion
    • Universal
      • Combine
      • Constant
      • Decay
      • Linear
      • Modulated
      • Tabular
      • TabularSpline
      • Trig
  • Constraint
    • MPC
    • ParticleCollision
    • RigidWall
    • RestitutionWall
    • FixedLength
    • MaxForce
    • NodeLine
    • NodeFacet
    • Embed2D
    • Embed3D
    • LJPotential2D
    • MaximumGap2D
    • MinimumGap2D
    • MaximumGap3D
    • MinimumGap3D
  • Converger
    • Converger
    • Absolute
      • AbsDisp
      • AbsError
      • AbsIncreDisp
      • AbsIncreAcc
      • AbsIncreEnergy
      • AbsResidual
    • Other
      • FixedNumber
      • Logic
    • Relative
      • RelDisp
      • RelError
      • RelIncreDisp
      • RelIncreAcc
      • RelIncreEnergy
      • RelResidual
  • Criterion
    • Criterion
    • MaxDisplacement
    • MaxHistory
    • MaxResistance
    • MinDisplacement
    • MinResistance
    • StrainEnergyEvolution
  • Element
    • Beam
      • B21
      • B21E
      • B21H
      • B31
      • B31OS
      • EB21
      • EB31OS
      • F21
      • F21H
      • F31
      • NMB21
      • NMB21E
      • NMB31
      • MVLEM
      • Orientation
    • Cube
      • C3D20
      • C3D4
      • C3D8
      • C3D8I
      • CIN3D8
      • DC3D4
      • DC3D8
    • Membrane
      • Couple Stress
      • Phase Field
        • DCP3
        • DCP4
      • Axisymmetric
        • CAX3
        • CAX4
        • CAX8
      • Plane
        • CP3
        • CP4
        • CP4I
        • CP5
        • CP6
        • CP7
        • CP8
      • Mixed
        • PS
        • QE2
      • Drilling
        • Allman
        • GCMQ
        • GQ12
      • Infinite
        • CINP4
      • Geotechnical
        • PCPE4DC
        • PCPE4UC
        • PCPE8DC
        • PCPE8UC
      • Membrane
    • Modifier
      • Modifier
      • ElementalLee
      • ElementalNonviscous
      • LinearViscosity
    • Patch
      • Patch
      • PatchCube
      • PatchQuad
    • Plate
      • DKT3
      • DKT4
      • Mindlin
    • Shell
      • DKTS3
      • DKTS4
      • S4
      • SGCMS
      • ShellBase
    • Special
      • Contact2D
      • Contact3D
      • Damper01
      • Damper02
      • Embedded2D
      • Embedded3D
      • Joint
      • Mass
      • SingleSection
      • Spring01
      • Spring02
      • Tie
      • TranslationConnector
    • Truss
      • T2D2
      • T2D2S
      • T3D2
      • T3D2S
  • Group
    • CustomNodeGroup
    • NodeGroup
    • ElementGroup
    • GroupGroup
  • Integrator
    • Implicit
      • Linear
      • BatheTwoStep
      • GeneralizedAlpha
      • OALTS
      • GSSSS
      • Newmark
        • LeeNewmark
        • LeeElementalNewmark
        • LeeNewmarkFull
        • LeeNewmarkIterative
        • Newmark
        • RayleighNewmark
        • WilsonPenzienNewmark
        • NonviscousNewmark
    • Explicit
      • Tchamwa
      • BatheExplicit
      • GeneralizedAlphaExplicit
  • Material
    • Guide
      • Metal
      • Customisation
    • Material1D
      • Concrete
        • ConcreteCM
        • ConcreteExp
        • ConcreteTsai
        • ConcreteTable
        • ConcreteK4
      • Degradation
        • Degradation
        • CustomStrainDegradation
        • CustomStressDegradation
        • Dhakal
        • TrilinearStrainDegradation
      • Elastic
        • BilinearElastic1D
        • Elastic1D
        • AsymmElastic1D
        • MultilinearElastic1D
        • PolyElastic1D
        • NLE1D01
        • Sinh1D
        • Tanh1D
        • CustomElastic1D
      • Hysteresis
        • AFC
        • AFCN
        • BilinearOO
        • BilinearPO
        • BoucWen
        • BWBN
        • Flag
        • MPF
        • MultilinearOO
        • MultilinearPO
        • RambergOsgood
        • SimpleHysteresis
        • SlipLock
        • SteelBRB
        • Trivial
        • Gap01
      • Viscosity
        • Kelvin
        • Maxwell
        • NonlinearViscosity
        • BilinearViscosity
        • CustomViscosity
        • Viscosity01
        • Viscosity02
        • CoulombFriction
        • Nonviscous01
      • vonMises
        • Subloading1D
        • ArmstrongFrederick1D
        • AFCO1D
        • Bilinear1D
        • BilinearMises1D
        • CustomGurson1D
        • CustomMises1D
        • ExpGurson1D
        • ExpMises1D
        • Mises1D
        • Multilinear1D
        • NonlinearGurson1D
        • VAFCRP1D
    • Material2D
      • AxisymmetricElastic
      • Concrete21
      • Concrete22
      • DuncanSelig
      • Elastic2D
      • Rebar2D
    • Material3D
      • CamClay
        • BilinearCC
        • ExpCC
        • NonlinearCamClay
        • ParabolicCC
      • Concrete
        • CDP
        • CDPM2
        • Rebar3D
        • TableCDP
        • CustomCDP
      • Damage
        • IsotropicDamage
        • LinearDamage
      • DruckerPrager
        • BilinearDP
        • ExpDP
        • CustomDP
        • NonlinearDruckerPrager
      • Elastic
        • BlatzKo
        • IsotropicElastic3D
        • IsotropicNonlinearElastic3D
        • MooneyRivlin
        • NLE3D01
        • OrthotropicElastic3D
        • Yeoh
      • Hoffman
        • BilinearHoffman
        • ExpHoffman
        • CustomHoffman
        • NonlinearHill
        • NonlinearHoffman
        • TimberPD
      • Sand
        • SimpleSand
        • DafalisaManzari
      • vonMises
        • ArmstrongFrederick
        • BilinearJ2
        • BilinearPeric
        • CustomGurson
        • TableGurson
        • CustomJ2
        • ExpGurson
        • ExpJ2
        • MultilinearJ2
        • NonlinearGurson
        • NonlinearJ2
        • NonlinearPeric
        • PolyJ2
        • VAFCRP
        • Subloading
    • MaterialOS
      • ElasticOS
    • Wrapper
      • Axisymmetric
      • Laminated
      • Parallel
      • PlaneStrain
      • PlaneSymmetric
      • PlaneStress
      • Rotation2D
      • Rotation3D
      • Sequential
      • Stacked
      • Uniaxial
      • OS146
      • OS146S
      • Substepping
  • Recorder
    • Recorder
    • OutputType
  • Section
    • Code
      • EU
      • NZ
      • US
    • Section1D
      • Circle1D
      • Fibre1D
      • Rectangle1D
      • TrussSection
    • Section2D
      • Bar2D
      • Box2D
      • Circle2D
      • CircularHollow2D
      • Fibre2D
      • HSection2D
      • ISection2D
      • Rectangle2D
      • TSection2D
    • Section3D
      • Bar3D
      • Box3D
      • Circle3D
      • CircularHollow3D
      • Fibre3D
      • ISection3D
      • Rectangle3D
      • TSection3D
    • SectionOS
      • Cell3DOS
      • Fibre3DOS
    • SectionNM
      • SectionNM
      • NM2D1
      • NM2D2
      • NM2D3
      • NM2D3K
      • NM3D1
      • NM3D2
      • NM3D3
      • NM3D3K
  • Solver
    • BFGS
    • MPDC
    • Newton
    • AICN
    • Ramm
  • Step
    • Overview
    • ArcLength
    • Buckle
    • Dynamic
    • Frequency
    • Optimization
    • Static
  • Developer
    • Prerequisites
    • C Style Interface
      • material
    • CPP Style Interface
      • material
      • element
      • constraint
Powered by GitBook
On this page
  • Prerequisites
  • Configuration
Edit on GitHub
  1. Basic

On Clusters

PreviousArchitecture DesignNextDeveloper

Last updated 2 months ago

The element state determination, as well as solving the global system, can be distributed over a process grid. By enabling MPI, the application can be executed on clusters. The architecture design is explained in page.

There are a few caveats. Since the distributed solvers for linear systems rely on external libraries, typically an implementation of ScaLAPACK. There are not many choices, one can use the reference implementation, which may not have a superb performance. One can also use AMD's or Intel's .

Prerequisites

Considering the procedure of manually compiling dependencies is cumbersome, we only support Intel's ecosystem for the moment. That means, MKL must be enabled via the option -DSP_ENABLE_MKL=ON.MKL can be installed a priori.

For example, according to the official documentation, with APT, it can be installed via the following commands.

sudo apt update
sudo apt install -y gpg-agent wget
wget -O- https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB | gpg --dearmor | sudo tee /usr/share/keyrings/oneapi-archive-keyring.gpg > /dev/null
echo "deb [signed-by=/usr/share/keyrings/oneapi-archive-keyring.gpg] https://apt.repos.intel.com/oneapi all main" | sudo tee /etc/apt/sources.list.d/oneAPI.list
sudo apt update
sudo apt install intel-oneapi-mkl-devel

It is likely that Intel's MPI library is also required, otherwise a working implementation shall be pre-installed. Note, the MPI bundled by your distro may not work.

sudo apt install intel-oneapi-mpi-devel

Configuration

The minimum configuration requires two flags: -DSP_ENABLE_MKL=ON and -DSP_ENABLE_MPI=ON.CMake will try to locate the installation of MKL and automatically configure the project. To ensure everything works as intended, it may be necessary to activate the oneAPI environment.

source /opt/intel/oneapi/setvars.sh

This can be done via, for example, environment file in the toolchain configuration in CLion, environmentSetupScript in in VS Code.

By default, it will use Intel's MPI. If another implementation, for example, MPICH and/or OpenMPI, is used, use MPI_HOME to override the path, for example, -DMPI_HOME=~/Documents/OpenMPI.

The project tries to automatically detect the vendor of MPI via its path, but it may fail. You may need to set MKL_MPI to a proper value. The available values can be found in /opt/intel/oneapi/mkl/latest/lib/cmake/mkl/MKLConfig.cmake. For example, on my machine, there are

# MKL_MPI
#    Values:  intelmpi, mpich, openmpi, msmpi, mshpc
#    Default: intelmpi

!!! note If two different implementations of MPI are used, the compilation may succeed, the application can still crash on execution.

Typically, for problems that need to be run on clusters, the default 32-bit indexing is not sufficient. The limit of a 32-bit integer is slightly above 2 billion, meaning that, if the global matrix is stored in dense full format, the maximum size is 46340. If each node has 2 DoFs (2D node with translational DoFs only), this corresponds to 23170 nodes. If each node has 6 DoFs (3D node with both translational and rotational DoFs), this corresponds to 7723 nodes. But in the context of FEM, the full storage is rarely used, so this is the lower bound. Considering the banded storage, typically the bandwidth is a small fraction of the global size, say, 1%, then the lower bound should be at least 800000 3D shell nodes.

If the problem is sizeable, it is possible to enable 64-bit indexing via -DSP_ENABLE_64BIT_INDEXING=ON. This will link ilp64 version of MKL, and compile all dependencies with the proper settings.

this
implementation
implementation
cmake-kits.json