Parallel distributed hp solution transfer


Doug

Aug 17, 2019, 3:10:35 AM
to deal.II User Group
Hello,

I am trying to use the parallel::distributed::SolutionTransfer<dim, LinearAlgebra::distributed::Vector<double>, hp::DoFHandler<dim>> class, but I can't seem to use it the same way as the serial version.

I looked through the tests/distributed_grids/solution_transfer_0*.cc tests, and none of them seem to test hp refinement. The code I'm working on is a bit large, so instead I reproduced the errors by copying tests/hp/solution_transfer.cc, which uses SolutionTransfer with hp, and modifying it to use the parallel distributed version.

Please find attached a proposed test. Lines 160-175, which do the p-refinement, are commented out; therefore, the error below is for h-refinement only, but with an hp object. Oddly enough, my own code works fine for h-refinement, yet I get the same error below when I do p-refinement.

110: --------------------------------------------------------
110: An error occurred in line <401> of file </home/ddong/Libraries/dealii/source/distributed/solution_transfer.cc> in function
110:     void dealii::parallel::distributed::SolutionTransfer<dim, VectorType, DoFHandlerType>::unpack_callback(const typename dealii::parallel::distributed::Triangulation<dim, DoFHandlerType:: space_dimension>::cell_iterator&, typename dealii::parallel::distributed::Triangulation<dim, DoFHandlerType:: space_dimension>::CellStatus, const boost::iterator_range<__gnu_cxx::__normal_iterator<const char*, std::vector<char, std::allocator<char> > > >&, std::vector<VectorType*>&) [with int dim = 2; VectorType = dealii::LinearAlgebra::distributed::Vector<double>; DoFHandlerType = dealii::hp::DoFHandler<2, 2>; typename dealii::parallel::distributed::Triangulation<dim, DoFHandlerType:: space_dimension>::cell_iterator = dealii::TriaIterator<dealii::CellAccessor<2, 2> >; typename dealii::parallel::distributed::Triangulation<dim, DoFHandlerType:: space_dimension>::CellStatus = dealii::Triangulation<2, 2>::CellStatus]
110: The violated condition was: 
110:     dof_handler->get_fe(fe_index).dofs_per_cell == it_dofvalues->size()
110: Additional information: 
110:     The transferred data was packed with a different number of dofs than the currently registered FE object assigned to the DoFHandler has.
110: --------------------------------------------------------

Now, if I uncomment those lines and therefore do p-refinement in the attached test, I get:

110: --------------------------------------------------------
110: An error occurred in line <382> of file </home/ddong/Libraries/dealii/source/distributed/solution_transfer.cc> in function
110:     void dealii::parallel::distributed::SolutionTransfer<dim, VectorType, DoFHandlerType>::unpack_callback(const typename dealii::parallel::distributed::Triangulation<dim, DoFHandlerType:: space_dimension>::cell_iterator&, typename dealii::parallel::distributed::Triangulation<dim, DoFHandlerType:: space_dimension>::CellStatus, const boost::iterator_range<__gnu_cxx::__normal_iterator<const char*, std::vector<char, std::allocator<char> > > >&, std::vector<VectorType*>&) [with int dim = 2; VectorType = dealii::LinearAlgebra::distributed::Vector<double>; DoFHandlerType = dealii::hp::DoFHandler<2, 2>; typename dealii::parallel::distributed::Triangulation<dim, DoFHandlerType:: space_dimension>::cell_iterator = dealii::TriaIterator<dealii::CellAccessor<2, 2> >; typename dealii::parallel::distributed::Triangulation<dim, DoFHandlerType:: space_dimension>::CellStatus = dealii::Triangulation<2, 2>::CellStatus]
110: The violated condition was: 
110:     cell->child(child_index)->active_fe_index() == fe_index
110: Additional information: 
110:     This exception -- which is used in many places in the library -- usually indicates that some condition which the author of the code thought must be satisfied at a certain point in an algorithm, is not fulfilled. An example would be that the first part of an algorithm sorts elements of an array in ascending order, and a second part of the algorithm later encounters an element that is not larger than the previous one.
110: 
110: There is usually not very much you can do if you encounter such an exception since it indicates an error in deal.II, not in your own program. Try to come up with the smallest possible program that still demonstrates the error and contact the deal.II mailing lists with it to obtain help.
110: --------------------------------------------------------

I went through some of the deal.II source code, but it's a bit out of my depth. Please let me know if I am somehow not p-refining correctly, or if I can be of any help in fixing this.

Best regards,

Doug
solution_transfer_05.cc

Wolfgang Bangerth

Aug 17, 2019, 5:21:34 PM
to dea...@googlegroups.com, Marc Fehling

Doug,

> I am trying to use the parallel::distributed::SolutionTransfer<dim,
> LinearAlgebra::distributed::Vector<double>, hp::DoFHandler<dim>> class, but I
> can't seem to use it the same way as the serial version.
>
> I looked through the tests/distributed_grids/solution_transfer_0*.cc tests, and
> none of them seem to test hp refinement.

Correct. The hp case hadn't been implemented until recently; most of it made it
into the 9.1 release, but some of it only happened on the master branch after
the release, so I would suggest that that's what you work with.

I don't recall which tests you should look at, but this functionality was
implemented by Marc Fehling, so I'm CC:ing him here.

Best
W.

--
------------------------------------------------------------------------
Wolfgang Bangerth email: bang...@colostate.edu
www: http://www.math.colostate.edu/~bangerth/

Marc Fehling

Aug 18, 2019, 9:44:50 PM
to deal.II User Group
Hi Doug,

when dealing with distributed meshes, ownership of cells changes, and we may not know which finite element lives on cells that a process has only recently been assigned. Thus, we need to transfer each cell's `active_fe_index`, which we do automatically during coarsening and refinement. However, you set `active_fe_indices` after refinement happened, which works in the serial case but no longer in the parallel one. Instead, you need to set `future_fe_indices`, which describe the finite element each cell will be assigned to, and you need to do that before executing refinement! This should resolve both issues.
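
In code, the intended order looks roughly like this -- a sketch only, with placeholder names (`triangulation`, `dof_handler`, `fe_collection`, `solution`, `mpi_communicator`, `dim`) rather than those from your test:

```cpp
// Sketch only: record the p-adaptation intent *before* refinement is executed.
// (This assumes `fe_collection` contains an element at the incremented index.)
for (const auto &cell : dof_handler.active_cell_iterators())
  if (cell->is_locally_owned())
    cell->set_future_fe_index(cell->active_fe_index() + 1);

// Prepare the transfer while the old mesh and old fe indices still exist;
// `solution` should be a vector that includes ghost entries.
parallel::distributed::SolutionTransfer<
  dim, LinearAlgebra::distributed::Vector<double>, hp::DoFHandler<dim>>
  soltrans(dof_handler);
soltrans.prepare_for_coarsening_and_refinement(solution);

// The fe indices travel with the cells during refinement and repartitioning.
triangulation.execute_coarsening_and_refinement();
dof_handler.distribute_dofs(fe_collection);

// Interpolate into a vector that knows the new parallel layout.
LinearAlgebra::distributed::Vector<double> new_solution(
  dof_handler.locally_owned_dofs(), mpi_communicator);
soltrans.interpolate(new_solution);
```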

Further, you initialize `LinearAlgebra::distributed::Vector` objects without any parallel distribution by using this constructor. Try using a different one.
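
For example, something along these lines (again just a sketch; `locally_owned_dofs`, `locally_relevant_dofs`, and `mpi_communicator` are placeholders):

```cpp
// Give the vector its parallel layout instead of only a global size.
LinearAlgebra::distributed::Vector<double> solution;
solution.reinit(locally_owned_dofs,     // IndexSet of DoFs this process owns
                locally_relevant_dofs,  // IndexSet that also includes ghost entries
                mpi_communicator);
```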

Please see `tests/mpi/solution_transfer_04.cc` and `tests/mpi/p_coarsening_and_refinement.cc` for working examples (I guess we should provide one using an actual `SolutionTransfer` object as well); they should hopefully be applicable to your problem. This is a recently added feature, so if you have any suggestions for improvement or encounter more problems, feel free to message us!

Marc

Doug

Aug 19, 2019, 12:44:03 AM
to deal.II User Group
Wow, I did not expect to get such a quick answer over the weekend. Thank you both for taking the time to reply.

The key thing here really was to use set_future_fe_index() instead of set_active_fe_index(). Would it make sense to deprecate one of them in the future? Otherwise, I would suggest adding the first paragraph you wrote above to the documentation of the hp finite element module. The test tests/mpi/p_coarsening_and_refinement.cc was indeed the perfect example to showcase p-refinement.
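
In code, the change was essentially the following (a sketch; `fe_index` stands for whatever element the cell should get):

```cpp
// Old, serial-only pattern: assign the element after refinement.
// cell->set_active_fe_index(fe_index);

// Pattern that also works in parallel: record the intent before refinement.
cell->set_future_fe_index(fe_index);
// ...then execute_coarsening_and_refinement() and distribute_dofs().
```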

Regarding the Vector constructor: I had replicated the bug even with a single core, so any bug related to the constructor did not show up. In my actual code, I do pass an MPI communicator.

I have modified the example to make it work. Note that in my original test, I was replicating a setup where both DG and CG elements are refined simultaneously for two different solutions. Since set_future_fe_index() now takes effect during grid refinement instead of at the exact moment of the set_active_fe_index() call, I wouldn't expect to be able to do the refinement for both DoF handlers at the same time.

I'd be happy to add this test to the list if that's something you are interested in. I just have to take the time to read up on how to do that.

Anyways, it all works as expected right now. Thank you again.

Doug

Wolfgang Bangerth

Aug 19, 2019, 1:17:10 PM
to dea...@googlegroups.com
On 8/18/19 10:44 PM, Doug wrote:
>
> I'd be happy to add this test to the list if that's something you are
> interested in. I just have to take the time to read up on how to do that.

I think this would be fantastic -- this area is new, so there aren't
that many tests yet. Anything that does what it is supposed to do will
therefore make for a good test, and since you already know how tests
look, it shouldn't even be very difficult to convert your code example
into a working test. Want to give this a try?

Doug

Aug 19, 2019, 1:40:00 PM
to deal.II User Group
Yes, will do. I have already turned that example into working code; I just need to clean it up a little and then I'll submit a pull request.

Also, I'm encountering this issue where tests fail because a zero error ends up being around 1e-13 due to round-off errors. This happens not only for my test, but for a lot of other deal.II tests. I see that most of deal.II's testing framework relies on picking up tests based on a .cc + .output file pair, making all the tests "regression"-like. I usually compare doubles with some tolerance, which I thought would be the common way to do this.

How do you provide tests in the deal.II testsuite to account for different round-off errors that might occur?

Wolfgang Bangerth

Aug 19, 2019, 1:43:31 PM
to dea...@googlegroups.com
On 8/19/19 11:40 AM, Doug wrote:
> Yes, will do. I have already turned that example into working code; I
> just need to clean it up a little and then I'll submit a pull request.
>
> Also, I'm encountering this issue where tests fail because a zero error
> ends up being around 1e-13 due to round-off errors. This happens not
> only for my test, but for a lot of other deal.II tests. I see that
> most of deal.II's testing framework relies on picking up tests based on
> a .cc + .output file pair, making all the tests "regression"-like. I
> usually compare doubles with some tolerance, which I thought would be
> the common way to do this.
>
> How do you provide tests in the deal.II testsuite to account for
> different round-off errors that might occur?

We use a program called 'numdiff'. If cmake finds it, the testsuite uses
it instead of regular 'diff', and then these small differences are ignored.

If you grep for 'numdiff' in the doc/ directory, there will be mention
of this program somewhere.

Doug

Aug 20, 2019, 10:00:49 PM
to deal.II User Group
Hello Marc,

I am trying to get the test working, and unfortunately it is going a lot less smoothly than expected. Please find the test attached as solution_transfer_05.cc. Only h-refinement occurs. I have also attached the before and after grids, which look OK. It still fails with that error:

8134: --------------------------------------------------------
8134: An error occurred in line <405> of file </home/ddong/Libraries/dealii/source/distributed/solution_transfer.cc> in function
8134:     void dealii::parallel::distributed::SolutionTransfer<dim, VectorType, DoFHandlerType>::unpack_callback(const typename dealii::parallel::distributed::Triangulation<dim, DoFHandlerType:: space_dimension>::cell_iterator&, typename dealii::parallel::distributed::Triangulation<dim, DoFHandlerType:: space_dimension>::CellStatus, const boost::iterator_range<__gnu_cxx::__normal_iterator<const char*, std::vector<char, std::allocator<char> > > >&, std::vector<VectorType*>&) [with int dim = 3; VectorType = dealii::LinearAlgebra::distributed::Vector<double>; DoFHandlerType = dealii::hp::DoFHandler<3, 3>; typename dealii::parallel::distributed::Triangulation<dim, DoFHandlerType:: space_dimension>::cell_iterator = dealii::TriaIterator<dealii::CellAccessor<3, 3> >; typename dealii::parallel::distributed::Triangulation<dim, DoFHandlerType:: space_dimension>::CellStatus = dealii::Triangulation<3, 3>::CellStatus]
8134: The violated condition was: 
8134:     dof_handler->get_fe(fe_index).dofs_per_cell == it_dofvalues->size()
8134: Additional information: 
8134:     The transferred data was packed with a different number of dofs than the currently registered FE object assigned to the DoFHandler has.
8134: --------------------------------------------------------

Note that if I refine the cells (uncomment line 150) instead of coarsening them (line 149), the solution transfer succeeds. Playing around with various refinement/coarsening combinations, it seems a bit arbitrary when it works and when it doesn't. Am I calling anything out of order?

Doug

solution_transfer_05.cc
after.eps
before.eps

Marc Fehling

Aug 21, 2019, 5:39:11 AM
to deal.II User Group
Hi Doug!


On Wednesday, August 21, 2019 at 4:00:49 AM UTC+2, Doug wrote:
> 8134:     void dealii::parallel::distributed::SolutionTransfer<dim, VectorType, DoFHandlerType>::unpack_callback(const typename dealii::parallel::distributed::Triangulation<dim, DoFHandlerType:: space_dimension>::cell_iterator&, typename dealii::parallel::distributed::Triangulation<dim, DoFHandlerType:: space_dimension>::CellStatus, const boost::iterator_range<__gnu_cxx::__normal_iterator<const char*, std::vector<char, std::allocator<char> > > >&, std::vector<VectorType*>&) [with int dim = 3; VectorType = dealii::LinearAlgebra::distributed::Vector<double>; DoFHandlerType = dealii::hp::DoFHandler<3, 3>; typename dealii::parallel::distributed::Triangulation<dim, DoFHandlerType:: space_dimension>::cell_iterator = dealii::TriaIterator<dealii::CellAccessor<3, 3> >; typename dealii::parallel::distributed::Triangulation<dim, DoFHandlerType:: space_dimension>::CellStatus = dealii::Triangulation<3, 3>::CellStatus]

It seems that the assertion fails only in the 3D case, which is quite interesting. Are you using the deal.II-9.1 release version, or do you work with the current master branch? I'll try to reproduce the error once I'm at my desk.

It looks like either p::d::SolutionTransfer picks the wrong finite element while packing the solution on cells to be coarsened, or hp::DoFHandler chooses the wrong finite element for the parent cell in case of coarsening. I'll investigate.

Marc

Doug

Aug 21, 2019, 10:22:51 AM
to deal.II User Group
I am using the master version from two days ago, commit c4c4e5209a. Actually, it also fails in 2D, but in a different scenario. I run this with mpirun=1 and mpirun=4.

It fails in 2D for mpirun=1.debug with the same error. It fails in 3D for mpirun=4.debug and mpirun=1.release. And it fully succeeds for mpirun=4.release...

I have attached the output from ctest.
ctest.log

Marc Fehling

Aug 23, 2019, 8:32:33 PM
to deal.II User Group
Hi Doug!

Your scenario indeed revealed a bug: Currently, we set and send `active_fe_indices` based on the refinement flags on the Triangulation object. However, p4est has the last word on deciding which cells will be refined -- and in your specific scenario, p4est makes use of that prerogative. I came up with a fix that should resolve this issue. Thank you for providing us with this example!

I'll open a pull request once my local testsuite passes.

Marc

Wolfgang Bangerth

Aug 24, 2019, 9:25:06 PM
to dea...@googlegroups.com
On 8/23/19 6:32 PM, Marc Fehling wrote:
>
> Your scenario indeed revealed a bug: Currently, we set and send
> `active_fe_indices` based on the refinement flags on the Triangulation object.
> However, p4est has the last word on deciding which cells will be refined

That's ultimately true, but we try to match what p4est expects in the
Triangulation::prepare_coarsening_and_refinement() function. Are you calling
this function after you decide which cells *you* want to refine/coarsen and
before you execute the refinement/coarsening?
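
I.e., schematically (placeholder names, and the coarsening criterion is hypothetical):

```cpp
// Decide what *you* want first...
for (const auto &cell : triangulation.active_cell_iterators())
  if (cell->is_locally_owned() && cell_should_be_coarsened(cell)) // hypothetical
    cell->set_coarsen_flag();

// ...let deal.II reconcile the flags with what p4est will actually do...
triangulation.prepare_coarsening_and_refinement();

// ...and only then execute.
triangulation.execute_coarsening_and_refinement();
```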

Marc Fehling

Aug 25, 2019, 9:44:29 AM
to deal.II User Group
Hi Doug! Hi Wolfgang!


On Sunday, August 25, 2019 at 3:25:06 AM UTC+2, Wolfgang Bangerth wrote:
> On 8/23/19 6:32 PM, Marc Fehling wrote:
> >
> > Your scenario indeed revealed a bug: Currently, we set and send
> > `active_fe_indices` based on the refinement flags on the Triangulation object.
> > However, p4est has the last word on deciding which cells will be refined
>
> That's ultimately true, but we try to match what p4est expects in the
> Triangulation::prepare_coarsening_and_refinement() function. Are you calling
> this function after you decide which cells *you* want to refine/coarsen and
> before you execute the refinement/coarsening?

We've set all refinement and coarsening flags on the p::d::Triangulation, called prepare_coarsening_and_refinement(), and then executed refinement. The transfer of active_fe_indices is prepared during the pre_distributed_refinement signal, but is executed later, after p4est has performed refinement on its forest. We utilize the CellStatus flags for the transfer.

In the scenario Doug provided, there are cells flagged for coarsening right next to cells flagged for refinement. I would guess that p4est and deal.II differ in how they enforce the 2:1 hanging-node condition in that situation.

I came up with a fix for this issue in PR #8637, which uses the CellStatus flags to determine active_fe_indices during coarsening.

Marc

Doug

Aug 26, 2019, 9:41:11 PM
to deal.II User Group
Thank you very much for the quick fix! I'm looking forward to pulling this once it goes through all the checks.

Doug

Marc Fehling

Aug 27, 2019, 3:51:20 AM
to deal.II User Group
Hi Doug!


On Tuesday, August 27, 2019 at 3:41:11 AM UTC+2, Doug wrote:
> Thank you very much for the quick fix! I'm looking forward to pulling this once it goes through all the checks.

The patch has been merged. Let me know if this fix does the trick for you.

We introduced a test named `tests/mpi/solution_transfer_05.cc` that may conflict with the name of the test you were preparing. I'm sorry if that's the case; please adjust the filenames of your test accordingly.

Marc

Doug

Aug 31, 2019, 11:51:39 AM
to deal.II User Group

On Tuesday, August 27, 2019 at 3:51:20 AM UTC-4, Marc Fehling wrote:

> The patch has been merged. Let me know if this fix does the trick for you.

It does! Thank you again for the quick support.

Doug 