Only while using MPI : debug build crashing & release build running


Sudip Kunda

Jun 11, 2025, 8:09:52 AM
to deal.II User Group
Hello all,

I've encountered a potential issue while compiling and running the deal.II code gallery example titled "Quasi-Static Finite-Strain Quasi-Incompressible Viscoelasticity" from <https://dealii.org/developer/doxygen/deal.II/code_gallery_Quasi_static_Finite_strain_Quasi_incompressible_ViscoElasticity.html>. Specifically, the release build runs correctly with MPI, but the debug build crashes when run with MPI. When run in serial (without MPI), both the debug and release builds run as expected and produce identical screen output.

Steps to reproduce:
- Download the files attached to this message.
- Run "cmake ." in the folder to which the files were downloaded.
- Run "make release", then "make", then "mpirun -np 2 ./viscoelastic_strip_with_hole". The program runs as expected.
- Run "make debug", then "make", then "mpirun -np 2 ./viscoelastic_strip_with_hole". The program crashes.

Library versions:
- deal.II: 9.6.2
- OpenMPI: 4.1.0
- Trilinos: 16.0.0
- METIS: version bundled with PETSc 3.22.3
- OS: Ubuntu 24.04 LTS

From the message displayed by deal.II upon crashing: the program crashes in the output_results() function at the first call to DataOut::add_data_vector, because the block vector being added to the DataOut object has non-contiguous blocks. Why is this a problem for the debug build with MPI but not for the release build with MPI? The error message asks the user to look at step-32 and step-55 for how component-wise DoF renumbering is used there, but I was not able to figure out the issue after reading through those examples.
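
For reference, my understanding of the component-wise renumbering pattern used in those tutorials is roughly the following (a sketch with a hypothetical helper and made-up block layout, not code taken from the gallery example):

#include <deal.II/base/index_set.h>
#include <deal.II/dofs/dof_handler.h>
#include <deal.II/dofs/dof_renumbering.h>
#include <deal.II/dofs/dof_tools.h>

using namespace dealii;

// Hypothetical helper: renumber DoFs so that the displacement (dim
// components), pressure and dilatation fields form three contiguous blocks,
// and return each block's view of the locally owned DoFs.
template <int dim>
std::vector<IndexSet> make_block_index_sets(DoFHandler<dim> &dof_handler)
{
  std::vector<unsigned int> block_component(dim + 2, 0); // u -> block 0
  block_component[dim]     = 1;                          // p -> block 1
  block_component[dim + 1] = 2;                          // J -> block 2
  DoFRenumbering::component_wise(dof_handler, block_component);

  const std::vector<types::global_dof_index> dofs_per_block =
    DoFTools::count_dofs_per_fe_block(dof_handler, block_component);

  // The debug-mode check that trips here complains when a block's locally
  // owned DoFs do not form a contiguous range on every rank.
  const IndexSet          owned = dof_handler.locally_owned_dofs();
  std::vector<IndexSet>   owned_per_block;
  types::global_dof_index begin = 0;
  for (const types::global_dof_index n : dofs_per_block)
    {
      owned_per_block.push_back(owned.get_view(begin, begin + n));
      begin += n;
    }
  return owned_per_block;
}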

Thanks,
Sudip Kunda

parameters.prm
CMakeLists.txt
viscoelastic_strip_with_hole.cc

Wolfgang Bangerth

Jun 26, 2025, 11:36:55 PM
to dea...@googlegroups.com

Sudip:

> I've encountered a potential issue while compiling and running the deal.II
> code gallery example titled "Quasi-Static Finite-Strain Quasi-Incompressible
> Viscoelasticity" from here
> <https://dealii.org/developer/doxygen/deal.II/code_gallery_Quasi_static_Finite_strain_Quasi_incompressible_ViscoElasticity.html>.
> Specifically, the release build runs correctly with MPI, but the debug build
> crashes when run with MPI. When run in serial (without MPI), both the debug
> and release builds run as expected and produce identical screen output.
>
> Steps to reproduce :
> - Download files attached to this message
> - Run "cmake ." in the folder to which files were downloaded
> - Run commands "make release" then "make" then "mpirun -np 2
> ./viscoelastic_strip_with_hole". Program will run as expected.
> - Run commands "make debug" then "make" then "mpirun -np 2
> ./viscoelastic_strip_with_hole". Program will crash.
>
> Library versions :
> - deal.II : 9.6.2
> - openmpi : 4.1.0
> - Trilinos : 16.0.0
> - metis : Version included with petsc 3.22.3
> - OS : Ubuntu 24.04 LTS
>
> From the message displayed by deal.II upon crashing: the program crashes in
> the output_results() function at the first call to
> DataOut::add_data_vector, because the block vector being added to the
> DataOut object has non-contiguous blocks. Why is this a problem for the
> debug build with MPI but not for the release build with MPI?

It is also a problem in release mode, it's just that there is no *check* to
catch the problem, and consequently the program continues without producing
the error. In all likelihood, the results are then wrong.
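
To illustrate why only the debug build stops: deal.II's internal assertions are compiled away in optimized (release) builds. A minimal sketch of that general pattern (not deal.II's actual Assert macro) looks like this:

#include <cstdlib>
#include <iostream>

// Sketch only. In debug builds the condition is checked and the program
// aborts with a message; in release builds the check expands to nothing,
// so the program keeps running on invalid data.
#ifdef DEBUG
#  define MY_ASSERT(condition, message)                                   \
    do                                                                     \
      {                                                                    \
        if (!(condition))                                                  \
          {                                                                \
            std::cerr << "Assertion failed: " << (message) << std::endl;   \
            std::abort();                                                  \
          }                                                                \
      }                                                                    \
    while (false)
#else
#  define MY_ASSERT(condition, message) \
    do                                  \
      {                                 \
      }                                 \
    while (false)
#endif

// Example of the kind of precondition that only trips in debug mode:
// MY_ASSERT(block_is_contiguous, "block vector has non-contiguous blocks");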

I spent the last hour looking at this program, and I think what happened is
that it *used* to be possible to partition a ::Triangulation object by hand
and have the DoFHandler do the right thing, but that that is no longer the
case (likely for good reasons). I verified that the underlying reason for the
crash you see can be fixed by using parallel::shared::Triangulation instead,
which is in essence what the program tried to emulate. It would be interesting
to know when (=with which deal.II release) this stopped working, if it ever
worked. Either way, a patch is here:
https://github.com/dealii/code-gallery/pull/217
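
The core of the change, in sketch form (the complete patch is in the pull request above), is to let the library do the partitioning by constructing the mesh as a parallel::shared::Triangulation from the start:

#include <deal.II/base/mpi.h>
#include <deal.II/distributed/shared_tria.h>
#include <deal.II/dofs/dof_handler.h>
#include <deal.II/fe/fe_q.h>
#include <deal.II/grid/grid_generator.h>

using namespace dealii;

int main(int argc, char *argv[])
{
  Utilities::MPI::MPI_InitFinalize mpi_init(argc, argv, 1);

  constexpr int dim = 2;

  // Every rank stores the whole mesh (as with the hand-partitioned
  // ::Triangulation the program tried to emulate), but the library assigns
  // the subdomain ids itself; no GridTools::partition_triangulation() call.
  parallel::shared::Triangulation<dim> triangulation(MPI_COMM_WORLD);
  GridGenerator::hyper_cube(triangulation);
  triangulation.refine_global(4);

  // The DoFHandler then sees locally owned DoFs that are consistent with
  // that partition.
  FE_Q<dim>       fe(1);
  DoFHandler<dim> dof_handler(triangulation);
  dof_handler.distribute_dofs(fe);

  return 0;
}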

Best
Wolfgang


--
------------------------------------------------------------------------
Wolfgang Bangerth email: bang...@colostate.edu
www: http://www.math.colostate.edu/~bangerth/


Wolfgang Bangerth

Jul 2, 2025, 12:32:59 PM
to deal.II User Group
Sudip:
I had to fix another issue, but the currently posted patch works for one and two processes. It would be nice if you could try it and make sure it also works for your purposes (with however many processes you want to use).
Best
 W.

Sudip Kunda

Jul 30, 2025, 6:41:33 PM
to deal.II User Group
Professor Bangerth,

This code, with parallel::shared::Triangulation instead of manual decomposition of the triangulation, is now working as expected with multiple processors in debug mode on my computer. Thank you for your help.

Best Regards,
Sudip Kunda

Wolfgang Bangerth

Jul 30, 2025, 6:45:01 PM
to dea...@googlegroups.com
On 7/30/25 16:41, Sudip Kunda wrote:
>
> This code, with parallel::shared::Triangulation instead of manual
> decomposition of the triangulation, is now working as expected with multiple
> processors in debug mode on my computer. Thank you for your help.

OK, that's nice to know -- thank you for double-checking!

Best
W.