Is it normal for the results to vary this much when running in parallel?


Abbas Ballout

Jun 12, 2023, 4:57:03 AM
to deal.II User Group
I am running step-18.

This is the output I get for a single quasi-static step with mpirun -np 1:

  Cycle 0:
    Number of active cells:       3712 (by partition: 3712)
    Number of degrees of freedom: 17226 (by partition: 17226)
    Assembling system... norm of rhs is 1.88062e+10
    Solver converged in 103 iterations.
    Updating quadrature point data...
  Cycle 1:
    Number of active cells:       12805 (by partition: 12805)
    Number of degrees of freedom: 51708 (by partition: 51708)
    Assembling system... norm of rhs is 1.86145e+10
    Solver converged in 120 iterations.
    Updating quadrature point data...
    Moving mesh...

And this is the output I get with mpirun -np 3:

Timestep 1 at time 1
  Cycle 0:
    Number of active cells:       3712 (by partition: 1360+1286+1066)
    Number of degrees of freedom: 17226 (by partition: 6651+5922+4653)
    Assembling system... norm of rhs is 1.88062e+10
    Solver converged in 131 iterations.
    Updating quadrature point data...
  Cycle 1:
    Number of active cells:       12805 (by partition: 4565+4425+3815)
    Number of degrees of freedom: 51708 (by partition: 19983+17250+14475)
    Assembling system... norm of rhs is 3.67161e+10
    Solver converged in 126 iterations.
    Updating quadrature point data...
    Moving mesh...


The l2 norm of the rhs in Cycle 1 differs between the runs with 1 and 3 MPI processes. Is this normal?

I am experiencing the same problem in my own code: the results seem to depend slightly on the number of MPI processes.

The code is attached below. It is step-18; the only difference is that I run a single step.
step-18.cc

Abbas Ballout

Jun 12, 2023, 5:59:05 AM
to deal.II User Group
Additionally:

I took step-40, changed the rhs term to a constant 1, and added a flux_calculation()
function that loops over the boundary and dots the normal with the gradient of the solution (akin to calculating forces).
I also added a GridTools::distort_random(0.3, triangulation, true, 500 or something); call after mesh generation to get an unstructured mesh.
I run it for a single cycle, and there seems to be a discrepancy in the flux calculation on the unstructured grid when running with different numbers of cores.
This does not happen with a structured grid.

The output for the flux calculation on a structured mesh with MPI 1 is:
Running with PETSc on 1 MPI rank(s)...
Cycle 0:
   Number of active cells:       64
   Number of degrees of freedom: 289
   Solved in 9 iterations.
Sigma_flux -0.994394

and for MPI 4 is:
Running with PETSc on 4 MPI rank(s)...
Cycle 0:
   Number of active cells:       64
   Number of degrees of freedom: 289
   Solved in 9 iterations.
Sigma_flux -0.994394

But after activating distort_random the output with MPI 1 is:
Running with PETSc on 1 MPI rank(s)...
Cycle 0:
   Number of active cells:       64
   Number of degrees of freedom: 289
   Solved in 8 iterations.
Sigma_flux -0.993997

but with MPI 4 the output is:
Running with PETSc on 4 MPI rank(s)...
Cycle 0:
   Number of active cells:       64
   Number of degrees of freedom: 289
   Solved in 9 iterations.
Sigma_flux -0.994323

The file is also attached below.

step-40.cc

Wolfgang Bangerth

Jun 16, 2023, 5:38:10 PM
to dea...@googlegroups.com
On 6/12/23 03:59, Abbas Ballout wrote:
>
> But after activating distort_random the output with MPI 1 is:
> Running with PETSc on 1 MPI rank(s)...
> Cycle 0:
>    Number of active cells:       64
>    Number of degrees of freedom: 289
>    Solved in 8 iterations.
> Sigma_flux -0.993997
>
> but with MPI 4 the output is:
> Running with PETSc on 4 MPI rank(s)...
> Cycle 0:
>    Number of active cells:       64
>    Number of degrees of freedom: 289
>    Solved in 9 iterations.
> Sigma_flux -0.994323

Abbas,
I suspect that when you call distort_random(), the mesh is different whether
you run on one or four processes. Have you confirmed that the meshes are
exactly identical?

Best
W.

--
------------------------------------------------------------------------
Wolfgang Bangerth email: bang...@colostate.edu
www: http://www.math.colostate.edu/~bangerth/


Wolfgang Bangerth

Jun 16, 2023, 5:44:43 PM
to dea...@googlegroups.com
On 6/12/23 02:57, Abbas Ballout wrote:
>
> I am running step 18.
>
> This is the output I get for a single quasi-static step with mpirun -np 1:
>
> Cycle 0:
>     Number of active cells:       3712 (by partition: 3712)
>     Number of degrees of freedom: 17226 (by partition: 17226)
>     Assembling system... norm of rhs is 1.88062e+10
>     Solver converged in 103 iterations.
>     Updating quadrature point data...
>   Cycle 1:
>     Number of active cells:       12805 (by partition: 12805)
>     Number of degrees of freedom: 51708 (by partition: 51708)
>     Assembling system... norm of rhs is 1.86145e+10
>     Solver converged in 120 iterations.
>     Updating quadrature point data...
>     Moving mesh...
>
>  And this is the output I get when mpirun 3:
>
> Timestep 1 at time 1
>   Cycle 0:
>     Number of active cells:       3712 (by partition: 1360+1286+1066)
>     Number of degrees of freedom: 17226 (by partition: 6651+5922+4653)
>     Assembling system... norm of rhs is 1.88062e+10
>     Solver converged in 131 iterations.
>     Updating quadrature point data...
>   Cycle 1:
>     Number of active cells:       12805 (by partition: 4565+4425+3815)
>     Number of degrees of freedom: 51708 (by partition: 19983+17250+14475)
>     Assembling system... norm of rhs is 3.67161e+10
>     Solver converged in 126 iterations.
>     Updating quadrature point data...
>     Moving mesh...
>
> The L2 norm in cycle 1 is different between runs with mpi 1 and mpi 3. Is this
> normal?

This is an interesting observation. It strikes me as a bug.

Have you compared the solution you get after running either 1 or 3 MPI
processes? If you visualize the result, or compute some other indicator of the
solution, in which ways do they differ?

Abbas Ballout

Jun 17, 2023, 3:58:43 PM
to deal.II User Group
  • About distort_random
"I suspect that when you call distort_random(), the mesh is different whether
you run on one or four processes. Have you confirmed that the meshes are
exactly identical?" 
You were right about this one. After importing a GMSH unstructured mesh, the results are perfect and independent of the number of MPI processes used.

  • About step-18
I plotted the stress in ParaView along a line for both the MPI 1 and MPI 3 runs. The results matched.
The MatrixTools::apply_boundary_values(boundary_values, system_matrix, tmp, system_rhs, false)
function appears to be the culprit.

With MPI 1 
rhs.l2() before calling apply_boundary_values 3032.65
rhs.l2() after calling apply_boundary_values 1.86145e+10  

with MPI 3 
rhs.l2() before calling apply_boundary_values 3032.65
rhs.l2() after calling apply_boundary_values 3.67161e+10

Is it because apply_boundary_values is doing something special to ghost nodes? 

Best,
Abbas 

Wolfgang Bangerth

Jun 21, 2023, 1:29:34 AM
to dea...@googlegroups.com
On 6/17/23 13:58, Abbas Ballout wrote:
> I plotted the stress in Paraview along a line for both MPI runs 1 and 3. The
> results matched.

That's good news!


> The MatrixTools::apply_boundary_values(boundary_values, system_matrix, tmp,
> system_rhs, false)
> function appears to be the culprit.
>
> With MPI 1
> rhs.l2() before calling apply_boundary_values 3032.65
> rhs.l2() after calling apply_boundary_values 1.86145e+10
>
> with MPI 3
> rhs.l2() before calling apply_boundary_values 3032.65
> rhs.l2() after calling apply_boundary_values 3.67161e+10
>
> Is it because apply_boundary_values is doing something special to ghost nodes?

Perhaps. I don't know. But you can find out if you output the elements of
system_rhs that correspond to boundary nodes or constrained nodes or in fact
any other node. It shouldn't be an issue with ghost nodes because the rhs
vector has no ghost entries.

Abbas Ballout

Jun 24, 2023, 11:13:35 AM
to deal.II User Group
Below I have plotted the rhs before and after applying boundary constraints when running with 1, 2, and 3 cores.
The rhs looks different at the boundary with MPI 3 as opposed to MPI 1 and 2.

The .cc and CMake files are also attached for convenience.
step-18.cc
MPI2.png
MPI3.png
MPI1.png
CMakeLists.txt

Wolfgang Bangerth

Jun 28, 2023, 10:17:55 AM
to dea...@googlegroups.com
On 6/24/23 09:13, Abbas Ballout wrote:
>
> Below I have plotted the rhs before and after applying boundary constraints
> when running with 1, 2, and 3 cores.
> The rhs looks different at the boundary with MPI 3 as opposed to MPI 1 and 2.

Right. The question is whether it is *wrong*.

Abbas Ballout

Jun 29, 2023, 3:59:41 AM
to deal.II User Group
This isn't a problem anymore.
At first I thought that what was happening with step-18 and with distort_random was caused by the same issue.
I am applying my BCs weakly, so it's okay.
Thanks for the help.
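For readers finding this thread later: "applying BCs weakly" here means enforcing u = g on the Dirichlet boundary through extra boundary terms in the weak form (Nitsche's method) instead of through MatrixTools::apply_boundary_values. Schematically, for the Poisson problem the symmetric variant reads:

$$
\int_\Omega \nabla u \cdot \nabla v
  \;-\; \int_{\Gamma_D} (\partial_n u)\, v
  \;-\; \int_{\Gamma_D} (\partial_n v)\, u
  \;+\; \frac{\gamma}{h} \int_{\Gamma_D} u\, v
\;=\;
\int_\Omega f\, v
  \;-\; \int_{\Gamma_D} (\partial_n v)\, g
  \;+\; \frac{\gamma}{h} \int_{\Gamma_D} g\, v,
$$

with $\gamma$ a sufficiently large penalty parameter and $h$ the local mesh size. Since no rows or rhs entries are modified after assembly, the MPI-dependent behavior of apply_boundary_values never enters.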