Dear FiPy community,
I am running FiPy in parallel with Trilinos on a single cluster node. I get the expected results when running as one process, but incorrect results when running across multiple cores. The script below (also attached as "problem.py") demonstrates this:
problem.py:
'''
import numpy as np
from fipy import *
import sys
from PyTrilinos import Epetra
np.set_printoptions(threshold=sys.maxsize)
mesh = GmshGrid3D(0.001, 0.001, 0.001, 10, 10, 10)
X = mesh.cellCenters[0]
T = CellVariable(mesh=mesh, value=20.)
alpha = CellVariable(mesh=mesh, value=2.88e-6)
T.setValue(100, where=(X >= 0.5*max(X)))
eq = TransientTerm() == DiffusionTerm(coeff=alpha)
parallelComm.Barrier()
sys.stdout.write(
    ("\n PID = "+str(Epetra.PyComm().MyPID())+"; "
     +str(mesh.numberOfCells)+
     " elements on processor "+str(parallelComm.procID)+
     " of "+str(parallelComm.Nproc)))
for step in range(3):
    eq.solve(T, dt=0.75)
parallelComm.Barrier()
Tg = T.globalValue
if parallelComm.procID == 0:
    Tr = np.reshape(Tg, (10, 10, 10))
    sys.stdout.write(("\n"+str(np.round(Tr[:, :, 9], 0))+"\n"))
'''
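One thing I was not sure about while putting this together: if `max(X)` is evaluated over only the cells local to each process, then the threshold `0.5*max(X)` used for the initial condition would differ between ranks. The following pure-NumPy sketch (no FiPy involved; the two-rank partition is made up for illustration) shows the effect I mean on a 1-D analogue of the grid:

```python
import numpy as np

# Cell-centre x coordinates of a 10-cell 1-D grid with spacing 0.001
x = (np.arange(10) + 0.5) * 0.001

# Hypothetical partitioning of the cells across two ranks
partitions = [x[:5], x[5:]]

# Threshold computed from the *global* maximum (what the serial run sees)
global_threshold = 0.5 * x.max()

# Threshold each rank would compute from its *local* maximum only
local_thresholds = [0.5 * part.max() for part in partitions]

print("global:", global_threshold)   # 0.00475
print("local: ", local_thresholds)   # [0.00225, 0.00475] -- rank 0 disagrees
```

If `max()` only sees local values in parallel, the hot region would be initialised differently on each rank, which might contribute to the discrepancy; I don't know whether this particular expression is parallel-aware in FiPy, which is partly why I am asking.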
Output from the serial and parallel runs:
'''
Serial process:
PID = 0; 1000 elements on processor 0 of 1
[[32. 32. 32. 32. 32. 32. 32. 32. 32. 32.]
[35. 35. 35. 35. 35. 35. 35. 35. 35. 35.]
[39. 39. 39. 39. 39. 39. 39. 39. 39. 39.]
[46. 46. 46. 46. 46. 46. 46. 46. 46. 46.]
[55. 55. 55. 55. 55. 55. 55. 55. 55. 55.]
[65. 65. 65. 65. 65. 65. 65. 65. 65. 65.]
[74. 74. 74. 74. 74. 74. 74. 74. 74. 74.]
[81. 81. 81. 81. 81. 81. 81. 81. 81. 81.]
[85. 85. 85. 85. 85. 85. 85. 85. 85. 85.]
[88. 88. 88. 88. 88. 88. 88. 88. 88. 88.]]
Parallel process:
Warning : Saving a partitioned mesh in a format older than 4.0 may cause information loss
PID = 1; 197 elements on processor 1 of 10
PID = 2; 241 elements on processor 2 of 10
PID = 6; 185 elements on processor 6 of 10
PID = 7; 194 elements on processor 7 of 10
PID = 3; 207 elements on processor 3 of 10
PID = 4; 194 elements on processor 4 of 10
PID = 5; 199 elements on processor 5 of 10
PID = 8; 256 elements on processor 8 of 10
PID = 9; 196 elements on processor 9 of 10
PID = 0; 223 elements on processor 0 of 10
[[43. 48. 46. 55. 52. 62. 72. 70. 79. 44.]
[42. 47. 45. 52. 62. 61. 70. 69. 77. 44.]
[42. 47. 55. 52. 64. 60. 72. 79. 79. 40.]
[37. 41. 39. 47. 58. 55. 67. 76. 81. 40.]
[39. 42. 42. 48. 48. 54. 67. 76. 83. 41.]
[44. 51. 49. 57. 69. 66. 76. 83. 83. 64.]
[60. 72. 80. 80. 87. 84. 90. 88. 91. 79.]
[85. 84. 89. 89. 92. 92. 91. 93. 93. 79.]
[86. 84. 90. 89. 92. 92. 92. 94. 93. 83.]
[88. 86. 88. 92. 90. 91. 93. 91. 92. 93.]]
done
'''
I am no expert, but I did notice the warning about saving a partitioned mesh. An older discussion on this list seemed to describe what could be the same problem, but my Gmsh version (4.11.1) is well above the minimum (>= 4.5.2) documented as FiPy-compatible in 2021. Trilinos itself seems to be working as it should, and although the example uses the built-in "GmshGrid3D" class, I got the same results with a ".geo_unrolled" file.
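Another thing I could not rule out is the ordering of `T.globalValue`: if the gather returns cells in partition order rather than in the serial cell order, then my `np.reshape(Tg, (10, 10, 10))` would display a scrambled field even if the solution itself were correct. This is the kind of coordinate-sorting check I had in mind (pure NumPy; the random permutation stands in for whatever ordering the gather actually uses, and the 4x4x4 grid size is just for brevity):

```python
import numpy as np

nx = ny = nz = 4
dx = 0.001

# Cell-centre coordinates of a small structured grid, serial ordering
xc, yc, zc = np.meshgrid(
    (np.arange(nx) + 0.5) * dx,
    (np.arange(ny) + 0.5) * dx,
    (np.arange(nz) + 0.5) * dx,
    indexing="ij",
)
values = np.arange(nx * ny * nz, dtype=float)  # stand-in for T in serial order

# Simulate a gather that returns cells in an arbitrary (partition) order
rng = np.random.default_rng(0)
perm = rng.permutation(values.size)
gathered_vals = values[perm]
gathered_x = xc.ravel()[perm]
gathered_y = yc.ravel()[perm]
gathered_z = zc.ravel()[perm]

# Restore serial ordering by sorting on (x, y, z); lexsort applies its
# last key first, so x is the slowest-varying index here
order = np.lexsort((gathered_z, gathered_y, gathered_x))
restored = gathered_vals[order].reshape((nx, ny, nz))

assert np.array_equal(restored.ravel(), values)  # serial ordering recovered
```

If sorting the gathered values by cell-centre coordinates like this made the parallel output match the serial one, the problem would be one of presentation ordering rather than the solve itself, but I have not been able to confirm which case I am in.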
For reference, my environment was created via the provided "conda-trilinos-lock.yml" and is as follows:
'''
python 3.7.12 | packaged by conda-forge | (default, Oct 26 2021, 06:08:53) | [GCC 9.4.0]
fipy 3.4.4+9.g9239820.dirty
numpy 1.21.6
pysparse not installed
scipy 1.7.3
matplotlib 3.5.3
mpi4py 3.1.3
petsc4py not installed
pyamgx not installed
PyTrilinos Trilinos version: 12.18.1
PyTrilinos version: 12.13
mayavi 4.8.1
gmsh 4.11.1
solver no-pysparse
'''
I would appreciate any suggestions you may be able to offer. If you have any questions or need further information, just let me know and I'll respond as soon as possible.
Kind regards,
Ed