Questions regarding parallel::shared::Triangulation and partitioning


Nihar Bhardwaj Darbhamulla

Dec 2, 2025, 10:47:41 PM
to deal.II User Group
I have a parallel::shared::Triangulation which, for load-balancing reasons, I have had to partition as shown below.

Screenshot from 2025-12-02 19-30-32.png

The reason for this partitioning is that the mesh undergoes computation in phases. Given this partitioning, I am attempting to renumber DoFs first by Cuthill-McKee and then block-wise. However, after either operation the renumbering becomes skewed, with the number of degrees of freedom on some partitions far exceeding the balance. I have attached below the output of locally_owned_dofs() from each partition before and after renumbering. The number of DoFs associated with each block also appears to shuffle around. In this case, what would be a viable way forward, since my objective is to construct and use block preconditioners for my problem?

Screenshot from 2025-12-02 19-47-13.png
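
In sketch form, the renumbering and the check I perform look roughly like this (simplified, not the exact code; dof_handler and mpi_communicator refer to my usual setup):

    // Renumber: Cuthill-McKee first, then block-wise.
    DoFRenumbering::Cuthill_McKee(dof_handler);
    DoFRenumbering::block_wise(dof_handler);

    // Per-process check, printed before and after the two calls above.
    const IndexSet owned = dof_handler.locally_owned_dofs();
    std::cout << "Rank "
              << Utilities::MPI::this_mpi_process(mpi_communicator)
              << " owns " << owned.n_elements() << " DoFs" << std::endl;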


Thanks,
Nihar

Wolfgang Bangerth

Dec 5, 2025, 12:40:23 PM
to dea...@googlegroups.com
On 12/2/25 20:47, Nihar Bhardwaj Darbhamulla wrote:
>
> The reason for this partitioning is that the mesh undergoes computation in
> phases. Given this partitioning, I am attempting to renumber dofs first by
> Cutthill McKee followed by block wise renumbering. However on doing either
> operation, my renumbering gets skewed with number of degrees of freedom far
> exceeding the balance. I have attached the output of locally_owned_dofs()
> below from each partition before and after renumbering. The number of dofs
> associated with each block also appear to shuffle around. In this case, what
> would be a viable way forward since my objective is to construct and use block
> preconditioners for my problem.

Nihar:
I'm not entirely sure I understand what you see. It would probably help if you
created a small test case that showed how you ended up with the problem.

In any case, if I interpret things right, then you partition the mesh so that
the two halves have roughly equal numbers of cells. That's how it should be. Do
you have different numbers of degrees of freedom on cells, via the hp
framework? If so, you may of course get different numbers of DoFs on each
partition -- just because the number of cells in each partition is balanced
does not mean that the number of DoFs is balanced if cells have different
numbers of local DoFs. If that's not the case: How do you calculate the number
of DoFs owned by each partition?
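
(As a point of reference: for a shared triangulation, every process can see the whole mesh, so one way to tabulate the DoFs owned by each partition is the following -- a minimal sketch, assuming a DoFHandler called dof_handler:

    // Count the DoFs owned by each subdomain of the shared triangulation.
    const std::vector<IndexSet> owned_per_subdomain =
      DoFTools::locally_owned_dofs_per_subdomain(dof_handler);
    for (unsigned int s = 0; s < owned_per_subdomain.size(); ++s)
      std::cout << "Subdomain " << s << " owns "
                << owned_per_subdomain[s].n_elements() << " DoFs"
                << std::endl;
)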

Best
W.

Nihar Bhardwaj Darbhamulla

Dec 16, 2025, 1:40:59 AM
to deal.II User Group
Hello Prof. Wolfgang,

Please find attached a minimal example illustrating what I have tried to do.

I have different PDEs governing different parts of the domain, and I have used the hp framework to create a collection of FESystem finite elements that ensures a stable element pairing is chosen. As an example, I have Q2-Q1 elements on one part of the domain (22 DoFs per cell in 2D, domain 1) and Q1 elements on another part of the domain (9 DoFs per cell in 2D, domain 2). This leads to different numbers of DoFs on different parts of the domain.
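
The element collection is set up along the following lines (a simplified sketch, not the exact attached code; the concrete element choices and material_id values are placeholders -- the relevant part is that both entries are padded with FE_Nothing so they have the same number of components):

    // One FESystem per physics domain, padded with FE_Nothing so that both
    // collection entries have the same number of vector components.
    hp::FECollection<2> fe_collection;
    fe_collection.push_back(FESystem<2>(FE_Q<2>(2), 2,        // field on domain 1
                                        FE_Q<2>(1), 1,        // field on domain 1
                                        FE_Nothing<2>(), 1)); // inactive here
    fe_collection.push_back(FESystem<2>(FE_Nothing<2>(), 2,   // inactive here
                                        FE_Nothing<2>(), 1,
                                        FE_Q<2>(1), 1));      // field on domain 2

    // Each locally owned cell picks the entry matching its material id.
    for (const auto &cell : dof_handler.active_cell_iterators())
      if (cell->is_locally_owned())
        cell->set_active_fe_index(cell->material_id() == 1 ? 0 : 1);

    dof_handler.distribute_dofs(fe_collection);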

When I partition my domain with METIS using the single-constraint framework, I obtain an equal split of cells across the processes, as expected, since all cells are equally weighted. When I instead assign cell weights proportional to the number of degrees of freedom, I end up with 9/31 of the processes holding the cells of domain 1 and 22/31 of the processes holding the cells of domain 2. In this scenario, during the linear solver phase, I end up with extensive process idling. To mitigate this overhead, I tried to set up a multi-constraint partitioning through METIS, weighting each cell by the phase of computation during which it will be active. This leads to some cells of domain 1 and domain 2 ending up on the same process, which mitigates at least part of the process idling. The METIS partitionings under the single constraint (right) and multiple constraints (left) are shown in the attached images.
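
For the single-constraint weighted case, the call sequence is essentially the following (simplified sketch; the weights 22 and 9 stand in for the per-cell DoF counts, and the material_id test is a placeholder):

    // One weight per active cell, proportional to its number of DoFs.
    std::vector<unsigned int> cell_weights;
    for (const auto &cell : triangulation.active_cell_iterators())
      cell_weights.push_back(cell->material_id() == 1 ? 22 : 9);

    GridTools::partition_triangulation(n_mpi_processes,
                                       cell_weights,
                                       triangulation,
                                       SparsityTools::Partitioner::metis);

This interface only takes a single weight per cell; the multi-constraint setup is done differently (see the attached code).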

The issue now arises when I try to renumber my degrees of freedom in a block-wise manner. Right after the multi-constraint partitioning, I can see that the degrees of freedom each process has to deal with are nearly identical, but I am not sure the renumbering behaves consistently with the way it is meant to work. I have attached the code below along with sample output. The code is built with deal.II/9.6.2, openmpi/4.1.1, and gcc/13.3. It can be built using cmake and run using

mpirun -n <np> MultiConstraint <n_grid> 

Here <np> is the number of processes and <n_grid> is the number of cells along the x-direction. The code writes the partitioning information to the screen, as well as the DoF information before and after renumbering. Furthermore, pvtu and vtu files are written illustrating the single-constraint partitioning (Grid_BasePart_<n_grid>) and the multi-constraint partitioning (Grid_MCPart_<ngrid>). Please let me know if I should add further details regarding my build.
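
The per-block numbers in the output come from a check along these lines (a simplified sketch, not the exact attached code): since block-wise renumbering places each block in a contiguous range of global indices, the owned DoFs can be split per block by intersecting index sets.

    // How the owned DoFs split across blocks after block-wise renumbering.
    const std::vector<types::global_dof_index> dofs_per_block =
      DoFTools::count_dofs_per_fe_block(dof_handler);
    const IndexSet owned = dof_handler.locally_owned_dofs();

    types::global_dof_index block_start = 0;
    for (unsigned int b = 0; b < dofs_per_block.size(); ++b)
      {
        IndexSet block_range(dof_handler.n_dofs());
        block_range.add_range(block_start, block_start + dofs_per_block[b]);
        std::cout << "Rank "
                  << Utilities::MPI::this_mpi_process(mpi_communicator)
                  << ", block " << b << ": "
                  << (owned & block_range).n_elements() << " owned DoFs"
                  << std::endl;
        block_start += dofs_per_block[b];
      }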

Thank you once again.
Regards,
Nihar
CMakeLists.txt
DoFInfoRenumbering.jpg
Partition_Comparison.jpeg
MultiConstraint.cc

Nihar Bhardwaj Darbhamulla

Dec 16, 2025, 4:05:26 PM
to deal.II User Group
Hello Prof. Wolfgang,

I've tried two things with single-constraint partitioning:
  1. Connect triangulation.signals.weight in the constructor of my class, before GridGenerator is called. In this case the partitioning balances the cells based on the weighting function, but it only works when the weighting function does not depend on the material_id (which makes sense, as material ids are not yet set while generating the grid).
  2. Accept the first partitioning as is (an even split across p processes), connect triangulation.signals.weight after GridGenerator is called, and then call GridTools::partition_triangulation (sketched below). While this assigns the subdomain_ids to the cells correctly based on the partitioner's output, the active_cell_indices are not updated, which results in the DoFHandler showing unexpected behaviour.
Therefore, I believe the problem reduces to how to repartition and update the active cells of the new partition in the case of a parallel::shared::Triangulation.
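
In sketch form, approach 2 looks like this (simplified; the weight values and the material_id test are placeholders standing in for the per-cell DoF counts):

    // Connect the weight signal only after the grid (and material ids) exist...
    triangulation.signals.weight.connect(
      [](const auto &cell, const auto /*status*/) -> unsigned int {
        return (cell->material_id() == 1 ? 22u : 9u);
      });

    // ...then repartition the shared triangulation with METIS.
    GridTools::partition_triangulation(
      Utilities::MPI::n_mpi_processes(mpi_communicator), triangulation);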

Thanks and Regards,
Nihar

Wolfgang Bangerth

Dec 17, 2025, 11:29:32 PM
to dea...@googlegroups.com
On 12/16/25 14:05, Nihar Bhardwaj Darbhamulla wrote:
>
> I've tried two things with single constraint partitioning:
>
> 1. Connect triangulation.signals.weight in the constructor of my class,
> before the GridGenerator is called. In this case, the partitioning
> balances the cells based on the weighting function, and only works when
> the weighting function does not depend on the material_id (makes sense, as
> we don't set material ids while generating the Grid).
> 2. Accept the first partitioning as is (an even split across p processes),
> and connect triangulation.signals.weight after GridGenerator is called,
> and then call GridTools::partition_triangulation. While this assigns the
> subdomain_ids to cells correctly, based on the partitioners output, the
> active_cell_indices are not updated, which results in the dof handler
> showing unexpected behaviour.
>
> Therefore, I believe the problem simplifies to how to repartition and update
> the active cells in the new partition in the case of
> parallel::shared::Triangulation.

Nihar -- I have tried to understand what it is you want to do, but have to
admit that it is difficult to understand someone else's 600+ lines of code :-(

If I understand you right, then the problem is in 2. above. But I'm not clear
what exactly the problem is. For parallel::shared::Triangulation, the mesh
remains unchanged during partitioning, and consequently active_cell_indices
should remain unchanged. The only thing partitioning does is to assign a
subdomain_id to each cell. Based on the picture you show, this seems to be
working as expected. (I assume that your domain is split into an upper and a
lower half with regards to the physics, and that the left picture shows your
multi-constraint partitioning in which every one of the four processes has a
part of the two physics domains. If these assumptions are right, then the
picture looks reasonable.)
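
To make this concrete -- a minimal, untested sketch, with n_partitions standing for the number of MPI processes -- one can check directly that repartitioning only reassigns subdomain ids while the cells and their active cell indices stay put:

    // Record the subdomain ids, repartition, and compare.
    std::vector<types::subdomain_id> before;
    for (const auto &cell : triangulation.active_cell_iterators())
      before.push_back(cell->subdomain_id());

    GridTools::partition_triangulation(n_partitions, triangulation);

    unsigned int i = 0;
    for (const auto &cell : triangulation.active_cell_iterators())
      std::cout << "Cell " << cell->active_cell_index()
                << ": subdomain " << before[i++] << " -> "
                << cell->subdomain_id() << std::endl;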

So then the remaining question is "which results in the dof handler showing
unexpected behaviour." Here, I'm out of my depth: Can you be more specific
about what it is you expect, what it is you see, and how these differ?

Best
WB