Large Memory Consumption making BlockDynamicSparsityPattern


Matteo Frigo

unread,
Oct 4, 2019, 9:57:24 PM10/4/19
to deal.II User Group
Hello,

I'm currently working on upgrading my code, adding PETSc as an alternative to Trilinos for the linear algebra package.
I'm implementing this option following tutorial step-55.
However, I'm running into some issues when I try to run massively parallel simulations.
In particular, large memory consumption occurs in the system setup phase.
After some debugging, I was able to figure out that the part of the code responsible for this is the generation of the sparsity pattern, i.e., the following lines:

BlockDynamicSparsityPattern dsp(local_partitioning);
DoFTools::make_sparsity_pattern(dof_handler, scratch_coupling, dsp,
                                constraints, false, this_mpi_process);

I want to point out that this behavior doesn't depend on PETSc; it is related only to the procedure by which we build the block sparsity pattern (BSP). Indeed, I ran into the same issue with Trilinos if the above strategy is selected.

In the previous version of the code, I used these lines to generate the BSP:

TrilinosWrappers::BlockSparsityPattern sp(local_partitioning, MPI_COMM_WORLD);
DoFTools::make_sparsity_pattern(dof_handler, matrix_coupling, sp,
                                constraints, false, this_mpi_process);
sp.compress();

In this latter case, the amount of memory required to generate the BSP is much smaller than in the first case.
Any ideas what is going on? Am I doing something wrong?
Thank you very much for your support.

Matteo

Wolfgang Bangerth

unread,
Oct 6, 2019, 2:19:51 PM10/6/19
to dea...@googlegroups.com

Matteo,

> Especially large memory consumption occurs in the system setup phase.
> After some debugging, I was able to figure out that the part of the code
> responsible for this is the generation of the sparsity pattern, i.e., the
> following rows:
>
> BlockDynamicSparsityPattern dsp(local_partitioning);
> DoFTools::make_sparsity_pattern(dof_handler, scratch_coupling, dsp,
> constraints, false,  this_mpi_process);
>
> I wanted to point out that this behavior doesn't depend on PETSc, but it is
> related only with the procedure wherewith we make the Block Sparsity Pattern
> (BSP). Indeed I ran into the same issue with Trilinos
> if the above strategy is selected.
>
> In the previous version of the code, I used these rows to generate the BSP:
>
> TrilinosWrappers::BlockSparsityPattern sp(local_partitioning,MPI_COMM_WORLD);
> DoFTools::make_sparsity_pattern(dof_handler, matrix_coupling,
> sp, constraints, false, this_mpi_process);
> sp.compress();
>
> In this last case, the amount of memory required to generate the BSP is much
> less respect with the first case.
> Any ideas what is going on?  Am I doing wrong something?

What is the precise difference? Just the existence of the communicator
argument when creating the sparsity object? Or are you saying that creating a
DynamicSparsityPattern is expensive whereas creating a
TrilinosWrappers::SparsityPattern is not? If so, do you think you could come
up with a small testcase that really just does that and nothing else, and
that illustrates the problem?

As a side note, how do you determine memory consumption?

Best
W.

--
------------------------------------------------------------------------
Wolfgang Bangerth email: bang...@colostate.edu
www: http://www.math.colostate.edu/~bangerth/

Matteo Frigo

unread,
Oct 9, 2019, 12:26:23 PM10/9/19
to dea...@googlegroups.com
Thank you very much for answering.
I'm saying that building a DynamicSparsityPattern using the procedure described above is more expensive, from the memory point of view, than using TrilinosWrappers::BlockSparsityPattern.
I noticed this while trying to run some test cases with a large number of unknowns (100 million DoFs).
In such cases, I get an out-of-memory error if I use DynamicSparsityPattern, whereas everything works fine using TrilinosWrappers::BlockSparsityPattern.
Investigating smaller cases, I noted that a peak of memory usage occurs during the call to:

DoFTools::make_sparsity_pattern(dof_handler, scratch_coupling, dsp,
                                constraints, false, this_mpi_process);

This means the problem is still there, even when the program runs to completion.
As far as the analysis of memory consumption is concerned, I used the Massif tool from Valgrind.
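In case it is useful, the invocation was along these lines (the program name and the pid in the output file are placeholders, not the actual ones):

```shell
# Record heap snapshots while the program runs;
# Massif writes its data to a file named massif.out.<pid>.
valgrind --tool=massif ./my_program

# Summarize the recorded snapshots and the peak (replace the pid):
ms_print massif.out.12345
```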

Matteo



--
The deal.II project is located at http://www.dealii.org/
For mailing list/forum options, see https://groups.google.com/d/forum/dealii?hl=en
---
You received this message because you are subscribed to the Google Groups "deal.II User Group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to dealii+un...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/dealii/62227886-8fd1-46f5-ab55-6161b5892878%40colostate.edu.

Wolfgang Bangerth

unread,
Oct 11, 2019, 1:41:51 AM10/11/19
to dea...@googlegroups.com
On 10/9/19 10:26 AM, Matteo Frigo wrote:
>
> I'm saying that making a DynamicSparsityPattern by using the procedure
> described above is more expensive (from the memory point of view) respect
> using TrilinosWrappers::BlockSparsityPattern.
> I noted this trouble trying to run some test cases with a large number of
> unknowns (100 millions of dofs).
> In such cases, I get an out of memory error if I use DynamicSparsityPattern,
> whereas it works fine using TrilinosWrappers::BlockSparsityPattern.
> Investigating on smaller cases, I noted that a peak of memory usage occurs
> during the call of the function:
>  DoFTools::make_sparsity_pattern(dof_handler, scratch_coupling, dsp,
> constraints, false, this_mpi_process);
> It means that the problem remains, even if the program runs until the end.
> As far as the analysis of memory consumption is concerned, I used the Massif
> tool from Valgrind.

Matteo,
thanks for clarifying. It would still be really nice if you could create a
small program that really just builds a mesh, a DoFHandler, and then the
sparsity pattern both ways -- this would make it easier for us to figure
out what is going on!

Matteo Frigo

unread,
Oct 11, 2019, 1:01:18 PM10/11/19
to dea...@googlegroups.com
I performed some tests with tutorial step-55. I noted that the same issue shows up there, too.
In particular, it occurs when I try to solve the 3D problem after 5 refinement cycles.
Changing the following part of the code:
      BlockDynamicSparsityPattern dsp(dofs_per_block, dofs_per_block);
      DoFTools::make_sparsity_pattern(dof_handler, coupling, dsp, constraints, false);
      SparsityTools::distribute_sparsity_pattern(
        dsp,
        dof_handler.compute_locally_owned_dofs_per_processor(),
        mpi_communicator,
        locally_relevant_dofs);
      system_matrix.reinit(owned_partitioning, dsp, mpi_communicator);
with this one:
      TrilinosWrappers::BlockSparsityPattern sp(owned_partitioning, MPI_COMM_WORLD);
      DoFTools::make_sparsity_pattern(dof_handler,
                                      coupling,
                                      sp,
                                      constraints,
                                      false,
                                      Utilities::MPI::this_mpi_process(mpi_communicator));
      sp.compress();
      system_matrix.reinit(sp);
everything works fine.
Obviously, I changed the same part for building the preconditioner sparsity pattern.
I think this test case can help you figure out what is going on.
Thanks,

Matteo 


Matteo Frigo

unread,
Oct 21, 2019, 4:20:16 PM10/21/19
to dea...@googlegroups.com
Dear Wolfgang,

I have figured out what the trouble is. I noticed that the dimensions stored in "dofs_per_block" are the global numbers of DoFs of each block. Hence, the constructor of "BlockDynamicSparsityPattern dsp" allocates, for each block, a std::vector<Line> whose size equals the global number of DoFs of the respective block.
Since each MPI process performs this allocation, it can cause an out-of-memory error when the code runs with many MPI processes.

Thank you very much for your time and for your help.
Best,

Matteo

Wolfgang Bangerth

unread,
Oct 21, 2019, 6:57:50 PM10/21/19
to dea...@googlegroups.com
On 10/21/19 2:20 PM, Matteo Frigo wrote:
>
> I have figured out what the trouble is. I notice that the dimensions
> stored in "dofs_per_block" are the global number of dof per each block.
> Hence, the constructor "BlockDynamicSparsityPattern dsp" allocates for
> each block a std::vector<line> with the size equal to the global dof
> number of the respective block.

Oh, doh, yes. I had looked at your code for a minute and thought about
that line, but didn't see the problem: you're not writing
    DynamicSparsityPattern dsp(dofs_per_block, dofs_per_block);
but
    BlockDynamicSparsityPattern dsp(dofs_per_block, dofs_per_block);

Indeed, the constructor arguments should of course be the number of
blocks, not the number of DoFs!
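For reference, the approach that avoids the per-rank global allocation -- sizing the pattern by IndexSets of locally relevant DoFs, as the current step-55 sources do -- looks roughly like this. This is a sketch, not a drop-in fix: the variable names are the tutorial's, and the exact signature of distribute_sparsity_pattern differs between deal.II versions.

```cpp
// Before (allocates global-length row arrays on EVERY rank):
//   BlockDynamicSparsityPattern dsp(dofs_per_block, dofs_per_block);

// After: size the pattern by locally relevant IndexSets instead, so each
// process only allocates the rows it actually touches.
BlockDynamicSparsityPattern dsp(relevant_partitioning);
DoFTools::make_sparsity_pattern(dof_handler, coupling, dsp, constraints, false);
SparsityTools::distribute_sparsity_pattern(dsp,
                                           dof_handler.locally_owned_dofs(),
                                           mpi_communicator,
                                           locally_relevant_dofs);
system_matrix.reinit(owned_partitioning, dsp, mpi_communicator);
```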