Parallelizing step-33 with MPI

Ellen M. Price

Dec 11, 2019, 4:53:23 PM
to deal.II User Group
Hi deal.II users and developers,

I just started with deal.II and am trying out a sample problem similar to the one I want to solve. I am trying to use step-33 for conservation equations and migrate parts of step-40 for the MPI capabilities I need. I seem to be stuck, however, when I am assembling the system. When I read from the "current_solution" or "old_solution" variables, I just get zeros, which lead to nan values in the computation. I'm too new to this software to know exactly where I've gone wrong, but I suspect I'm reading from the wrong locations or need to distribute the vector differently? As a concrete example, when I run in a debugger, the following lines

std::vector<Sacado::Fad::DFad<double>>
    independent_local_dof_values(dofs_per_cell);

for (unsigned int i = 0; i < dofs_per_cell; ++i)
    independent_local_dof_values[i] = current_solution(local_dof_indices[i]);

produce a vector of zero values, even though the initial condition is certainly not all zeros (based on VTK output and the input deck). Any help is appreciated; I can post other parts of the program that are modified, even a diff if that's helpful.

Ellen Price

Wolfgang Bangerth

Dec 16, 2019, 7:03:05 PM
to dea...@googlegroups.com

Ellen,

> I just started with deal.II and am trying out a sample problem similar to the
> one I want to solve. I am trying to use step-33 for conservation equations and
> migrate parts of step-40 for the MPI capabilities I need. I seem to be stuck,
> however, when I am assembling the system. When I read from the
> "current_solution" or "old_solution" variables, I just get zeros, which lead
> to nan values in the computation. I'm too new to this software to know exactly
> where I've gone wrong, but I suspect I'm reading from the wrong locations or
> need to distribute the vector differently?

That's possible. First, are you running in debug mode? Second, are *all*
values you get zero? If that's the case, then I suspect that you just didn't
initialize the vector correctly: nothing nonzero was ever put into it
apparently. If you get zeros for only some entries, then it may be that this
happens only on the ghost entries of the vector, and that these need to be
set correctly. But it's hard to tell for sure without knowing more.
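
For reference, here is a minimal sketch of what I mean by setting the ghost
entries, assuming a TrilinosWrappers::MPI::Vector set up as in step-40
(InitialCondition is just a placeholder for whatever function you actually use):

// Write into a purely locally owned (non-ghosted) vector first:
TrilinosWrappers::MPI::Vector tmp(locally_owned_dofs, MPI_COMM_WORLD);
VectorTools::interpolate(dof_handler, InitialCondition<dim>(), tmp);

// Copying into the ghosted vector also imports the ghost values, so that
// current_solution(local_dof_indices[i]) is valid for all DoFs that appear
// on locally owned cells:
current_solution = tmp;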


> As a concrete example, when I run
> in a debugger, the following lines
>
> std::vector<Sacado::Fad::DFad<double>>
>     independent_local_dof_values(dofs_per_cell);
>
> for (unsigned int i = 0; i < dofs_per_cell; ++i)
>     independent_local_dof_values[i] = current_solution(local_dof_indices[i]);
>
> produce a vector of zero values, even though the initial condition is
> certainly not all zeros (based on VTK output and the input deck).

So what do you get if you compare the concrete values of
independent_local_dof_values[i] and current_solution(local_dof_indices[i])?
Are both of them zero? If so, then current_solution was apparently already
wrong to begin with.
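
For example, a quick (and purely illustrative) check that prints both numbers
side by side, using the variable names from your snippet:

for (unsigned int i = 0; i < dofs_per_cell; ++i)
  {
    const double vector_entry = current_solution(local_dof_indices[i]);
    independent_local_dof_values[i] = vector_entry;

    // .val() returns the value part of the Sacado AD type:
    std::cout << "dof " << local_dof_indices[i]
              << ": vector entry = " << vector_entry
              << ", AD value = " << independent_local_dof_values[i].val()
              << std::endl;
  }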

Best
W.

--
------------------------------------------------------------------------
Wolfgang Bangerth email: bang...@colostate.edu
www: http://www.math.colostate.edu/~bangerth/

Ellen M. Price

Dec 20, 2019, 8:42:44 PM
to deal.II User Group
I did eventually solve this problem, but I ran into another one. I think I'm close, I just need a little guidance. The problems occur in assembling the system. I've checked the right-hand side vector and it has some odd features. I'm running the "slide" problem, just like step-33 does. I suppose my question is, how should the system assembly routine be changed to accommodate parallelization? All I did was add an if statement to check whether a cell is locally owned before computing on it. Conceptually, what else needs to change?

Happy to post the modified code if it helps.

Ellen Price

Wolfgang Bangerth

Dec 20, 2019, 11:24:23 PM
to dea...@googlegroups.com
On 12/20/19 6:42 PM, Ellen M. Price wrote:
> I did eventually solve this problem, but I ran into another one. I think I'm
> close, I just need a little guidance. The problems occur in assembling the
> system. I've checked the right-hand side vector and it has some odd features.
> I'm running the "slide" problem, just like step-33 does. I suppose my question
> is, how should the system assembly routine be changed to accommodate
> parallelization? All I did was add an if statement to check whether a cell is
> locally owned before computing on it. Conceptually, what else needs to change?

Ellen,
not actually very much: You need a non-ghosted (writable) vector, which
apparently you have, for your right hand side. Then you loop over the locally
owned cells (which you do), and the only other thing you have to do is to call
compress() at the end of the assembly.
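
In code, the pattern looks roughly like this (only a sketch; cell_matrix,
cell_rhs, and local_dof_indices stand for whatever you already compute on
each cell):

for (const auto &cell : dof_handler.active_cell_iterators())
  if (cell->is_locally_owned())
    {
      // ... compute cell_matrix and cell_rhs for this cell, then scatter them:
      cell->get_dof_indices(local_dof_indices);
      system_matrix.add(local_dof_indices, cell_matrix);
      right_hand_side.add(local_dof_indices, cell_rhs);
    }

// Exchange contributions that were added to rows owned by other processors:
system_matrix.compress(VectorOperation::add);
right_hand_side.compress(VectorOperation::add);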

There are many other things one can do to test whether things work correctly,
but "has some odd features" is not enough of a hint to allow for more concrete
suggestions :-)

Best
WB

Ellen M. Price

Dec 21, 2019, 2:52:17 PM
to deal.II User Group
Thanks for the feedback. That problem turned out to be me writing to the wrong matrix elements, and I'm now very close to having a working prototype of this code. I think the (hopefully last) bug is in this section of code:

dof_handler.clear();
dof_handler.distribute_dofs(fe);
locally_owned_dofs = dof_handler.locally_owned_dofs();
DoFTools::extract_locally_relevant_dofs(dof_handler, locally_relevant_dofs);

DynamicSparsityPattern dsp(locally_relevant_dofs);
DoFTools::make_sparsity_pattern(dof_handler, dsp);
SparsityTools::distribute_sparsity_pattern(
    dsp, dof_handler.n_locally_owned_dofs_per_processor(),
    MPI_COMM_WORLD, locally_relevant_dofs);

system_matrix.reinit(locally_owned_dofs, locally_owned_dofs,
    dsp, MPI_COMM_WORLD);

current_solution.reinit(locally_owned_dofs,
    locally_relevant_dofs, MPI_COMM_WORLD);
right_hand_side.reinit(locally_owned_dofs, MPI_COMM_WORLD);

I get a runtime exception from the TrilinosWrappers module when I call SparseMatrix::add() on more than one processor, and I suspect that I've set up the sparsity pattern incorrectly (maybe the system_matrix doesn't know how to distribute entries, or it gets the wrong indices?). Is there anything obviously wrong with what I've done here? There are a lot of notes in the doxygen about setting up the Trilinos sparsity pattern, but nothing I've tried (like calling compress() after setting up the matrix) works.

Ellen

luca.heltai

Dec 22, 2019, 3:29:58 AM
to Deal.II Users
Dear Ellen,

you may want to compare with this:

https://github.com/luca-heltai/dealii/pull/91/files#diff-acfb3c7b43e4935be016fda6ebdc5881

It is a parallel version of step-38 that never made it into the library, written by one of our students during a deal.II workshop in Trieste (held by Timo and myself).

Best,
Luca.
