refine_mesh gets a different number of active cells on different processes with parallel::shared::Triangulation


Yiyang Zhang

Oct 27, 2017, 3:33:24 PM
to deal.II User Group
Hello, 

I am using a parallel::shared::Triangulation. Sometimes, but not always, I get the following very strange behaviour when doing refine_mesh.

This is the code.

int refine_count  = 0;
int coarsen_count = 0;

for (auto cell = this->tria.begin_active(); cell != this->tria.end(); ++cell)
  {
    if (cell->refine_flag_set())
      refine_count++;
    if (cell->coarsen_flag_set())
      coarsen_count++;
  }

std::cout << "Process No. " << Utilities::MPI::this_mpi_process(this->mpi_communicator)
          << ". Refine Count = " << refine_count << std::endl;
std::cout << "Process No. " << Utilities::MPI::this_mpi_process(this->mpi_communicator)
          << ". Coarsen Count = " << coarsen_count << std::endl;

SolutionTransfer<dim> sol_trans(this->dof_handler);

this->tria.prepare_coarsening_and_refinement();
sol_trans.prepare_for_coarsening_and_refinement(locally_relevant_sln);

std::cout << "Process No. " << Utilities::MPI::this_mpi_process(this->mpi_communicator)
          << ". Pre_refine, n_active_cells = " << this->tria.n_active_cells() << std::endl;
this->run_mesh_refinement();
std::cout << "Process No. " << Utilities::MPI::this_mpi_process(this->mpi_communicator)
          << ". Post_refine, n_active_cells = " << this->tria.n_active_cells() << std::endl;


and I have the following output:

Process No. 1. Refine Count = 6144
Process No. 1. Coarsen Count = 1442
Process No. 0. Refine Count = 6144
Process No. 0. Coarsen Count = 1442
Process No. 1. Pre_refine, n_active_cells = 43772
Process No. 0. Pre_refine, n_active_cells = 43772
Process No. 0. Post_refine, n_active_cells = 61367
Process No. 1. Post_refine, n_active_cells = 61364


I am running with 2 MPI processes. It looks very weird that both processes have the same refine count and coarsen count, but end up with different numbers of active cells.

Given a parallel::shared::Triangulation, if the two processes have different active cells, the program cannot proceed...
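
For what it's worth, the mismatch can also be detected programmatically. Here is a minimal sketch (assuming direct access to this->tria and this->mpi_communicator, as in the code above) that compares n_active_cells() across processes right after refinement:

const unsigned int n_local = this->tria.n_active_cells();
const unsigned int n_min   = Utilities::MPI::min(n_local, this->mpi_communicator);
const unsigned int n_max   = Utilities::MPI::max(n_local, this->mpi_communicator);
// All processes of a parallel::shared::Triangulation must see the same mesh.
AssertThrow(n_min == n_max,
            ExcMessage("n_active_cells() differs between MPI processes."));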

Wolfgang Bangerth

Oct 27, 2017, 3:39:32 PM
to dea...@googlegroups.com
On 10/27/2017 01:33 PM, Yiyang Zhang wrote:
>
> I am running with 2 MPI processes. It looks very weird that both
> processes have the same refine count and coarsen count, but end up with
> different number of active cells.
>
> Given a parallel::shared::Triangulation mesh, if the two processes have
> different active cells, then the program cannot proceed...

If this is really with a shared::Triangulation, then this is a problem.
How do you set the refine and coarsen flags? Are you setting them in
such a way that they are the same on all processors?

Best
W.

--
------------------------------------------------------------------------
Wolfgang Bangerth email: bang...@colostate.edu
www: http://www.math.colostate.edu/~bangerth/

Yiyang Zhang

Oct 27, 2017, 4:13:24 PM
to deal.II User Group
Hello Prof. Bangerth,

Yes, I think I am setting them in exactly the same way on all processes.
I can check n_active_cells() before the refinement, as well as the refine and coarsen flags on each process, and they are the same on every process. But after the refinement, the number of active cells suddenly becomes different.
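
To make that check stricter than just comparing counts (equal counts do not guarantee the flags sit on the same cells), one could compare a position-dependent checksum of the flags across processes. A minimal sketch, again assuming access to this->tria and this->mpi_communicator:

double refine_checksum  = 0;
double coarsen_checksum = 0;
for (auto cell = this->tria.begin_active(); cell != this->tria.end(); ++cell)
  {
    if (cell->refine_flag_set())
      refine_checksum += cell->active_cell_index();
    if (cell->coarsen_flag_set())
      coarsen_checksum += cell->active_cell_index();
  }
// If the flags are identical on all processes, the checksums agree everywhere.
if ((Utilities::MPI::min(refine_checksum, this->mpi_communicator) !=
     Utilities::MPI::max(refine_checksum, this->mpi_communicator)) ||
    (Utilities::MPI::min(coarsen_checksum, this->mpi_communicator) !=
     Utilities::MPI::max(coarsen_checksum, this->mpi_communicator)))
  std::cout << "Refine/coarsen flags differ between processes!" << std::endl;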

I attached the refine_mesh() code.

Best,
Yiyang

Vector<float> estimated_error_per_cell(this->tria.n_active_cells());
KellyErrorEstimator<dim>::estimate(this->dof_handler,
                                   QGauss<dim - 1>(Q_POINTS),
                                   typename FunctionMap<dim>::type(),
                                   locally_relevant_sln,
                                   estimated_error_per_cell,
                                   ComponentMask(),
                                   0,
                                   MultithreadInfo::n_threads(),
                                   Utilities::MPI::this_mpi_process(this->mpi_communicator));

IndexSet local_active_cells(this->tria.n_active_cells());
local_active_cells.clear();
for (auto cell = this->tria.begin_active(); cell != this->tria.end(); ++cell)
  {
    if (cell->is_locally_owned())
      local_active_cells.add_index(cell->active_cell_index());
  }

TrilinosWrappers::MPI::Vector distributed_error_per_cell(local_active_cells,
                                                         this->mpi_communicator);
for (auto cell = this->tria.begin_active(); cell != this->tria.end(); ++cell)
  {
    if (cell->is_locally_owned())
      distributed_error_per_cell(cell->active_cell_index()) =
        estimated_error_per_cell(cell->active_cell_index());
  }
distributed_error_per_cell.compress(VectorOperation::insert);
estimated_error_per_cell = distributed_error_per_cell;

GridRefinement::refine_and_coarsen_fixed_number(this->tria,
                                                estimated_error_per_cell,
                                                param[0],
                                                param[1],
                                                max_cells);

// Restrict grid levels if these two parameters are given.
if (max_grid_level > 0)
  {
    if (this->tria.n_levels() > max_grid_level)
      {
        for (typename Triangulation<dim>::active_cell_iterator
               cell = this->tria.begin_active(max_grid_level);
             cell != this->tria.end(max_grid_level); ++cell)
          cell->clear_refine_flag();
      }
  }
if (min_grid_level > 0)
  {
    for (typename Triangulation<dim>::active_cell_iterator
           cell = this->tria.begin_active(min_grid_level);
         cell != this->tria.end_active(min_grid_level); ++cell)
      cell->clear_coarsen_flag();
  }

Wolfgang Bangerth

Oct 27, 2017, 6:12:49 PM
to dea...@googlegroups.com
On 10/27/2017 02:13 PM, Yiyang Zhang wrote:
>
> Yes I think I am setting them in a way that is exactly same throughout
> all processes.
> Since I can check the n_active_cells() before the refinement, and also
> the refine_flags and coarsen_flags for each process. They are the same
> for each process. But after the refinement, the number of active_cells
> suddenly becomes different.
>
> I attached the refine_mesh() code.

Can you fabricate a complete testcase that we can run that shows this
phenomenon?

Yiyang Zhang

Oct 28, 2017, 1:00:07 PM
to deal.II User Group
Hello Prof. Bangerth,

I have attached a test case.

Best,
Yiyang
shared_tria_test_result.txt
shared_tria_test.h
test_main.cpp

Timo Heister

Oct 29, 2017, 7:03:12 PM
to dea...@googlegroups.com
Yiyang,

I am not quite sure why, but the code

estimated_error_per_cell = distributed_error_per_cell;

generates an incorrect result. You can replace it with

for (auto cell = tria.begin_active(); cell != tria.end(); ++cell)
  if (cell->is_locally_owned())
    estimated_error_per_cell(cell->active_cell_index()) =
      distributed_error_per_cell(cell->active_cell_index());

Can you try whether that works for you?

I will continue to investigate what is going on.

--
Timo Heister
http://www.math.clemson.edu/~heister/

Yiyang Zhang

Oct 29, 2017, 8:04:53 PM
to deal.II User Group
Hello Timo, 

Does the replacement code serve the same purpose as the original code? The purpose of this line is to synchronize the error values computed by each process into the vector estimated_error_per_cell, so that every process ends up with the same estimated_error_per_cell.
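
For reference, here is a minimal sketch of one other way to achieve that synchronization, using a plain MPI reduction. It assumes (as the subdomain_id argument to KellyErrorEstimator::estimate ensures) that each process has filled only the entries of its locally owned cells and left all other entries at zero; this is just an illustration, not the code from the test case:

// Element-wise sum over all processes: entries this process did not compute
// are zero, so the sum reconstructs the complete error vector on every rank.
// Vector<float> stores its entries contiguously, so begin() can be handed to MPI.
MPI_Allreduce(MPI_IN_PLACE,
              estimated_error_per_cell.begin(),
              static_cast<int>(estimated_error_per_cell.size()),
              MPI_FLOAT,
              MPI_SUM,
              this->mpi_communicator);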

I wrote this part of the code myself, since there is no tutorial on how to write this part... Step-18 only has a guide on how to write this part with PETSc.

Best,
Yiyang

Timo Heister

Oct 30, 2017, 9:45:42 AM
to dea...@googlegroups.com
> Does the replacement code serve the same purpose as the original code?

Yes.

> I wrote this part of the code myself, since there is no tutorial on how to
> write this part... Step-18 only has a guide on how to write this part with
> PETSc.

Yes, it is a problem that we don't demonstrate how to do this.

Yiyang Zhang

Oct 30, 2017, 10:38:35 AM
to deal.II User Group
Hello, Timo,

I made the replacement. The program no longer crashes, but the output still shows that the number of active cells differs between MPI processes...

By the way, I don't see how the replacement code serves the same purpose... with the replacement, it seems the information from the other processes is not synced into this process...

Best,
Yiyang