Error with PETSc and multithreading?


Konrad

Sep 7, 2019, 9:46:20 AM9/7/19
to deal.II User Group
Dear deal.ii community,

I am encountering an error when running the following program (only relevant parts):

class NedRTStd
{
public:
  struct Parameters
  {
    // Some parameters go here
  };

  NedRTStd() = delete;
  NedRTStd(Parameters &parameters);
  ~NedRTStd();

  void run();

private:
  void setup_grid();
  void setup_system_matrix();
  void setup_constraints();
  void assemble_system();
  void solve_direct();
  void solve_iterative();
  void solve_iterative_preconditioned();
  void output_results() const;

  MPI_Comm mpi_communicator;

  Parameters &parameters;

  parallel::distributed::Triangulation<3> triangulation;

  FESystem<3>   fe; // Will hold 2 blocks (Nedelec-Raviart-Thomas pair)
  DoFHandler<3> dof_handler;

  std::vector<IndexSet> owned_partitioning;
  std::vector<IndexSet> relevant_partitioning;

  AffineConstraints<double> constraints;

  LA::MPI::BlockSparseMatrix system_matrix;

  LA::MPI::BlockVector locally_relevant_solution;
  LA::MPI::BlockVector system_rhs;

  ConditionalOStream pcout;
  TimerOutput        computing_timer;
};


NedRTStd::NedRTStd(Parameters &parameters_)
  : mpi_communicator(MPI_COMM_WORLD)
  , parameters(parameters_)
  , triangulation(mpi_communicator,
                  typename Triangulation<3>::MeshSmoothing(
                    Triangulation<3>::smoothing_on_refinement |
                    Triangulation<3>::smoothing_on_coarsening))
  , fe(FE_Nedelec<3>(1), 1, FE_RaviartThomas<3>(1), 1)
  , dof_handler(triangulation)
  , pcout(std::cout, (Utilities::MPI::this_mpi_process(MPI_COMM_WORLD) == 0))
  , computing_timer(mpi_communicator,
                    pcout,
                    TimerOutput::summary,
                    TimerOutput::wall_times)
{}


void NedRTStd::setup_system_matrix()
{
  TimerOutput::Scope t(computing_timer, "system and constraint setup");

  dof_handler.distribute_dofs(fe);

  if (parameters.renumber_dofs)
    {
      DoFRenumbering::Cuthill_McKee(dof_handler);
    }

  DoFRenumbering::block_wise(dof_handler);

  // fe has 2 blocks and 6 components
  std::vector<types::global_dof_index> dofs_per_block(2);
  DoFTools::count_dofs_per_block(dof_handler, dofs_per_block);
  const unsigned int n_sigma = dofs_per_block[0],
                     n_u     = dofs_per_block[1];

  pcout << "Number of active cells: "
        << triangulation.n_global_active_cells()
        << " (on " << triangulation.n_levels() << " levels)"
        << std::endl
        << "Number of degrees of freedom: "
        << dof_handler.n_dofs()
        << " (" << n_sigma << '+' << n_u << ')'
        << std::endl;

  owned_partitioning.resize(2);
  owned_partitioning[0] =
    dof_handler.locally_owned_dofs().get_view(0, n_sigma);
  owned_partitioning[1] =
    dof_handler.locally_owned_dofs().get_view(n_sigma, n_sigma + n_u);

  IndexSet locally_relevant_dofs;
  DoFTools::extract_locally_relevant_dofs(dof_handler, locally_relevant_dofs);
  relevant_partitioning.resize(2);
  relevant_partitioning[0] = locally_relevant_dofs.get_view(0, n_sigma);
  relevant_partitioning[1] =
    locally_relevant_dofs.get_view(n_sigma, n_sigma + n_u);

  setup_constraints(); // Only hanging nodes here

  {
    BlockDynamicSparsityPattern dsp(dofs_per_block, dofs_per_block);

    DoFTools::make_sparsity_pattern(dof_handler, dsp, constraints, false);
    SparsityTools::distribute_sparsity_pattern(
      dsp,
      dof_handler.locally_owned_dofs_per_processor(),
      mpi_communicator,
      locally_relevant_dofs);

    system_matrix.clear();
    system_matrix.reinit(owned_partitioning, dsp, mpi_communicator);
  }

  // The exception is thrown here...
  locally_relevant_solution.reinit(owned_partitioning,
                                   relevant_partitioning,
                                   mpi_communicator);
  // ...and here as well
  system_rhs.reinit(owned_partitioning, mpi_communicator);
}

I get the following exception:

--------------------------------------------------------
An error occurred in line <122> of file </home/ks/lib_compile/dealii-9.1.1/source/lac/petsc_vector_base.cc> in function
   dealii::PETScWrappers::VectorBase::VectorBase()
The violated condition was:  
   MultithreadInfo::is_running_single_threaded()
Additional information:  
   PETSc does not support multi-threaded access, set the thread limit to 1 in MPI_InitFinalize().

.
.
.
--------------------------------------------------------

I don't have a clue what is happening here, since a similar setup works in tutorial step-55 (I cannot tell what I am doing wrong). I do not deliberately invoke any threading.

Sorry for posting this if inappropriate - you are of course not responsible for debugging my code - but any hint would be appreciated.

Best,
Konrad

Konrad

Sep 9, 2019, 4:47:12 AM9/9/19
to deal.II User Group
OK, just to answer my own question: if you are running with PETSc, you should not call

dealii::Utilities::MPI::MPI_InitFinalize
  mpi_initialization(argc, argv, dealii::numbers::invalid_unsigned_int);

since this lets deal.II spawn threads whenever the number of MPI processes you start differs from the number of cores you use. Instead, one should pass (when using generic linear algebra)

#ifdef USE_PETSC_LA
  dealii::Utilities::MPI::MPI_InitFinalize
    mpi_initialization(argc, argv, /* disable threading for PETSc */ 1);
#else
  dealii::Utilities::MPI::MPI_InitFinalize
    mpi_initialization(argc, argv, dealii::numbers::invalid_unsigned_int);
#endif

I guess that was the bug.

Best,
Konrad

Wolfgang Bangerth

Sep 9, 2019, 10:02:40 AM9/9/19
to dea...@googlegroups.com
Yes, this is indeed the case, and is what the error message was trying to
suggest. If you think that the error message could have been clearer, feel
free to propose a better wording and we'll include it for the next release!

Best
W.

--
------------------------------------------------------------------------
Wolfgang Bangerth email: bang...@colostate.edu
www: http://www.math.colostate.edu/~bangerth/

Konrad

Sep 9, 2019, 5:10:38 PM9/9/19
to deal.II User Group
> Yes, this is indeed the case, and is what the error message was trying to
> suggest. If you think that the error message could have been clearer, feel
> free to propose a better wording and we'll include it for the next release!

Thanks a lot, Wolfgang! The error message is actually quite clear. What I also found out is that it is advantageous to actually read the documentation ... ;-)

Best,
Konrad

Wolfgang Bangerth

Sep 9, 2019, 5:31:32 PM9/9/19
to dea...@googlegroups.com
On 9/9/19 3:10 PM, Konrad wrote:
>
> Thanks a lot, Wolfgang! The error message is actually quite clear. What
> I found out in addition is that it is advantageous and helpful to read
> the documentation ... ;-)

Yes, no doubt :-) We're spending a lot of time writing it ;-)

Cheers