Problem with snappyHexMesh when OpenFOAM is compiled using FMA-Instructions
Summary
I found a problem when using snappyHexMesh and a specific grid. In certain machines, the grid is correctly generated, in others, snappyHexMesh hangs in an endless loop.
The problem is correlated to FMA instructions (Fused Multiply-Accumulate) in new CPUs. I have tested using AMD EPYC and Intel Skylake CPUs, and various versions of GCC and Intel Compiler. When compiling the code with –O2 or –O3 and the flag for FMA (-mfma) or the CPU architecture (-march=znver2 or -march=skylake), FMA instructions are turned on, which have a different rounding precision than operations without FMA. Thus a test in particle.C is too strict, and the distance between particles and tet faces is not captured correctly. This creates an endless loop at the stage “Feature refinement iteration 0” of the grid creation.
The operation that uses FMA is the ^ operator used to calculate T in Foam::particle::stationaryTetReverseTransform. The test that fails is in Foam::particle::trackToStationaryTri, line 759 of particle.C (if (Tx1[i] < - detA*SMALL))
.
Steps to reproduce
Compile OpenFOAM with FMA flags on and off (the bugged code is in the src/lagrangian/basic/particle library).
Create the mesh from the file attached (mesh.tgz) using snappyHexMesh. Using the version compiled with FMA flags hangs.
Example case
A small code example with the same operations as the ^ operator is also attached (Test_CrossProduct.cpp). Using separated multiply and addition results in (0 0 0), as the calculations are rounded twice. When using FMA, the result is rounded only once, showing the rounding error of a double variable (e.g. -8.73968e-17 -1.82965e-16 1.82965e-16).
Without FMA
g++ -O3 -o Test_CrossProduct Test_CrossProduct.cpp
chmod 755 Test_CrossProduct
./Test_CrossProduct
With FMA
g++ -mfma -O3 -o Test_CrossProduct Test_CrossProduct.cpp
chmod 755 Test_CrossProduct
./Test_CrossProduct
What is the current bug behaviour?
snappyHexMesh hangs in an endless loop.
What is the expected correct behavior?
snappyHexMesh creates the grid.
Relevant logs and/or images
Environment information
- OpenFOAM version : v2106
- Operating system : CentOS 8
- Hardware info : AMD EPYC, Intel Skylake
- Compiler : GCC and intel
Possible fixes
The solution for this is straightforward, just replacing SMALL with ROOTSMALL, and everything works. The file with the correction is attached.