FETools::Interpolate bug?


Denis Davydov

Mar 17, 2014, 6:31:20 PM
to dea...@googlegroups.com
Dear all,

it seems I made FETools::interpolate crash when using it with objects from a shared triangulation:

FETools::interpolate (dof_handler_1,
                      locally_relevant_vector_1,
                      dof_handler_2,
                      hanging_nodes_constraints_2,
                      local_vector_2);

locally_relevant_vector_1 is initialised using the IndexSet locally_relevant_dofs_1 (which will contain all DoFs for a shared triangulation):
DoFTools::extract_locally_relevant_dofs (dof_handler_1,
                                         locally_relevant_dofs_1);
The IndexSet of locally owned DoFs is obtained via a custom function, which is discussed in another thread but seems to work ok.


--------------------------------------------------------
An error occurred in line <68> of file </Users/davydden/libs-sources/dealii/deal.II-complex-head/deal.II/source/lac/trilinos_vector_base.cc> in function
    dealii::TrilinosWrappers::internal::VectorReference::operator double() const
The violated condition was:
    local_index >= 0
The name and call sequence of the exception was:
    ExcAccessToNonLocalElement (index, vector.vector->Map().MinMyGID(), vector.vector->Map().MaxMyGID())
Additional Information:
You tried to access element 7 of a distributed vector, but this element is not stored on the current processor. Note: The elements stored on the current processor are within the range 0 through 6 but Trilinos vectors need not store contiguous ranges on each processor, and not every element in this range may in fact be stored locally.

#0  libdeal_II.g.8.2.pre.dylib  0x0000000108ad3e26  dealii::TrilinosWrappers::internal::VectorReference::operator double() const
#1  libdeal_II.g.8.2.pre.dylib  0x00000001070521ff  void dealii::FETools::interpolate<2, 2, dealii::DoFHandler, dealii::DoFHandler, dealii::TrilinosWrappers::MPI::Vector, dealii::TrilinosWrappers::MPI::Vector>(dealii::DoFHandler<2, 2> const&, dealii::TrilinosWrappers::MPI::Vector const&, dealii::DoFHandler<2, 2> const&, dealii::ConstraintMatrix const&, dealii::TrilinosWrappers::MPI::Vector&)
#2  problem                     0x00000001058e8393  main
#3  libdyld.dylib               0x00007fff90a455fd  start
#4  ???                         0x0000000000000001  0x0


Please see the attached minimal example, which triggers this error when run with mpirun -np 2.


Kind regards,
Denis.
problem.cc

Denis Davydov

Mar 18, 2014, 12:06:49 PM
to dea...@googlegroups.com
I was trying to debug it using Eclipse PTP today, but it seems some of my settings are wrong, as I can't step into the deal.II functions even though everything is in debug mode.
So if anyone has a working environment with parallel debugging, please have a look.

Regards,
Denis.

Timo Heister

Mar 18, 2014, 2:27:32 PM
to dea...@googlegroups.com
I will take a look at that soon.

Some information to help with parallel debugging:
https://code.google.com/p/dealii/wiki/FrequentlyAskedQuestions#How_do_I_debug_MPI_programs?

To get you started:
please print the locally owned and locally relevant IndexSets of both
CPUs. Is index 7 in there?
> --
> The deal.II project is located at http://www.dealii.org/
> For mailing list/forum options, see
> https://groups.google.com/d/forum/dealii?hl=en
> ---
> You received this message because you are subscribed to the Google Groups
> "deal.II User Group" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to dealii+un...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.



--
Timo Heister
http://www.math.clemson.edu/~heister/

Denis Davydov

Mar 18, 2014, 3:58:13 PM
to dea...@googlegroups.com
Dear Timo,


On Tuesday, March 18, 2014 7:27:32 PM UTC+1, Timo Heister wrote:
I will take a look at that soon.

Thank you!
I will try using plain gdb tonight and tomorrow (without any Eclipse PTP stuff).
 

To get you started:
please print the locally owned and locally relevant IndexSets of both
CPUs. Is index 7 in there?


this is the output of a slightly modified version (attached):

project from locally_relevant(dof_1) to locally_owned(dof_2)
p=0
owned_1: {[0,2]}
relevant_1: {[0,5]}
n_dofs_1: 6
owned_2: {[0,6]}
n_dofs_2: 15
p=1
owned_1: {[3,5]}
relevant_1: {[0,5]}
n_dofs_1: 6
owned_2: {[7,14]}
n_dofs_2: 15 

Note that I did not yet try to implement a proper function to get the locally relevant DoFs, but for 2 elements (which is the case in this test problem)
that should give all DoFs anyway. Thus, I don't think it will make any difference.

P.S. The current error is:
You tried to access element 0 of a distributed vector, but this element is not stored on the current processor. Note: The elements stored on the current processor are within the range 7 through 14

Kind regards,
Denis.
  
problem.cc

Denis Davydov

Mar 19, 2014, 7:00:57 AM
to dea...@googlegroups.com
Still can't get any debugging of deal.II (be it serial or parallel) to work... :(

 
But I think I know what the problem is. For a shared triangulation,
triangulation.locally_owned_subdomain() == numbers::invalid_subdomain_id.
So the interpolation actually visits all cells, whether they are owned by the current processor or not,
and then tries to write to the locally owned vector on all cells, hence the error.
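This failure mode can be modeled with a small standalone sketch (plain C++ with only the standard library, no deal.II; the type aliases and function names below are invented for illustration): when the locally owned subdomain is the invalid id, the guard in the cell loop accepts every cell, so every processor visits all cells.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical stand-ins for deal.II types: a cell is represented only by
// its subdomain id; invalid_subdomain_id mimics numbers::invalid_subdomain_id.
using subdomain_id = unsigned int;
constexpr subdomain_id invalid_subdomain_id =
  std::numeric_limits<subdomain_id>::max();

// Model of the guard in the interpolation cell loop: a cell is processed
// if it belongs to 'my_subdomain', or if the triangulation reports no
// subdomain at all (the shared-triangulation case under discussion).
inline bool cell_is_processed(subdomain_id cell_subdomain,
                              subdomain_id my_subdomain)
{
  return (cell_subdomain == my_subdomain) ||
         (my_subdomain == invalid_subdomain_id);
}

// Count how many cells one processor would visit.
inline std::size_t n_visited_cells(const std::vector<subdomain_id> &cells,
                                   subdomain_id my_subdomain)
{
  std::size_t n = 0;
  for (const subdomain_id s : cells)
    if (cell_is_processed(s, my_subdomain))
      ++n;
  return n;
}
```

With two cells owned by subdomains 0 and 1, a processor whose locally owned subdomain is the invalid id visits both cells (and then touches vector entries it does not own), while a processor with a proper subdomain id visits only its own cell.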

Below is a self-explanatory output of the debug program (attached): 

project from locally_relevant(dof_1) to locally_owned(dof_2)
p=0
owned_1: {[0,2]}
relevant_1: {[0,5]}
n_dofs_1: 6
owned_2: {[0,6]}
n_dofs_2: 15
 locally owned subdomain_id = 4294967295
 cells considered for interpolation:
id of cell1: 0_0: dofs of cell2: 0 7 1 2 3 8 4 5 6
id of cell1: 1_0: dofs of cell2: 7 9 2 10 8 11 12 13 14
p=1
owned_1: {[3,5]}
relevant_1: {[0,5]}
n_dofs_1: 6
owned_2: {[7,14]}
n_dofs_2: 15
 locally owned subdomain_id = 4294967295
 cells considered for interpolation:
id of cell1: 0_0: dofs of cell2: 0 7 1 2 3 8 4 5 6
id of cell1: 1_0: dofs of cell2: 7 9 2 10 8 11 12 13 14

 cells subdomain_ids are: 0 1
invalid subdomain id: 4294967295

Another thing I noticed is that
cell2->get_dof_indices(dofs);
gives back all DoFs in the case of a shared triangulation. I don't know if this is the desired behaviour, since it might happen that
on a particular cell with a given subdomain id only part of the DoFs are actually local to this processor.

Regards,
Denis.

Denis Davydov

Mar 19, 2014, 7:02:13 AM
to dea...@googlegroups.com
Sorry, I forgot to attach the file.
problem.cc

Timo Heister

Mar 19, 2014, 2:58:59 PM
to dea...@googlegroups.com
> So interpolation actually visits all cells, be it those owned by current
> processor or not.

Well, you will be running into this problem in many places. You either:
a) implement new functions for interpolate() etc. that take a
subdomainid as an argument and only look at the specified subdomain.
b) change the definition of is_ghost()/is_artificial(). This would
require what I suggested earlier (creating a new type of
Triangulation).

Option a) is certainly easier for now (just add one function), but
there are many places in the library that behave similarly and I don't
like the idea of introducing new variants of many (dozens of?)
functions.

Option b) is actually not that much work, because you don't need to
implement a lot in the new Triangulation class. The only reason to
have it is so that we can detect that this type is used and report
correct values for is_ghost(), is_artificial() etc.. I think the only
thing to implement in the new class is:
1. locally_owned_elements
2. after refining the mesh call partition(), then mark all cells
except own and ghost as artificial by setting the subdomain to
artificial.
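Step 2 above can be sketched as a standalone model (plain C++ with only the standard library, no deal.II; the flat cell representation, artificial_id, and the helper name are all invented for illustration): after partitioning, any cell that is neither locally owned nor a ghost gets its subdomain id overwritten with the artificial marker.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical artificial marker (mimics numbers::artificial_subdomain_id).
constexpr unsigned int artificial_id = 0xffffffffu;

// After partitioning, keep the subdomain id of our own cells and of the
// ghost cells (given here by their cell indices); overwrite the subdomain
// id of every other cell with the artificial marker.
inline std::vector<unsigned int>
mark_artificial(std::vector<unsigned int> subdomain_of_cell,
                unsigned int my_subdomain,
                const std::set<std::size_t> &ghost_cells)
{
  for (std::size_t c = 0; c < subdomain_of_cell.size(); ++c)
    if (subdomain_of_cell[c] != my_subdomain && ghost_cells.count(c) == 0)
      subdomain_of_cell[c] = artificial_id;
  return subdomain_of_cell;
}
```

Here the cell->is_artificial() / is_ghost() queries then reduce to simple comparisons against the stored subdomain id, which is the point of option b).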

> And then tries to write to the locally owned vector on all cells, thus the
> error.

No, writing is actually not the problem. You are not allowed to read
from random elements, but you can write anywhere.


> Another thing i notice is that
>
> cell2->get_dof_indices(dofs);
>
> gives back all DoFs in case of a shared triangulation.

Yes that is okay, but this won't be true with my suggestion above.

Denis Davydov

Mar 19, 2014, 3:35:55 PM
to dea...@googlegroups.com
Dear Timo, 

On Wednesday, March 19, 2014 7:58:59 PM UTC+1, Timo Heister wrote:
> So interpolation actually visits all cells, be it those owned by current
> processor or not.

Well, you will be running into this problem in many places. You either:
a) implement new functions for interpolate() etc. that take a
subdomainid as an argument and only look at the specified subdomain.
b) change the definition of is_ghost()/is_artificial(). This would
require what I suggested earlier (creating a new type of
Triangulation).

Shouldn't the if condition in the first loop be changed to include is_artificial() to make it work?


if ((cell1->subdomain_id() == subdomain_id)
    ||
    (subdomain_id == numbers::invalid_subdomain_id))

otherwise for a shared triangulation it will go through all cells on every processor, right?

 

Option a) is certainly easier for now (just add one function), but
there are many places in the library that behave similarly and I don't
like the idea of introducing new variants of many (dozens of?)
functions.

true
 

Option b) is actually not that much work, because you don't need to
implement a lot in the new Triangulation class. The only reason to
have it is so that we can detect that this type is used and report
correct values for is_ghost(), is_artificial() etc.. I think the only
thing to implement in the new class is:
1. locally_owned_elements

Locally owned cells?
I guess in general, after calling GridTools::partition_triangulation (n_mpi_processes, triangulation),
the user has not yet decided which processor uses which part of the triangulation.
Supposedly this derived Triangulation class could receive an integer in its constructor
(in the normal case, this_mpi_process)
to be used as the output of triangulation.locally_owned_subdomain()?
Is that what you mean?
  
2. after refining the mesh call partition(), then mark all cells

Perhaps execute_coarsening_and_refinement() could be overridden to call the parent class function and then
GridTools::partition_triangulation (N, triangulation)?
It seems one would also have to provide the number of partitions in the constructor (and keep it const) to use here?
 
except own and ghost as artificial by setting the subdomain to
artificial.


So the same question of how to efficiently mark ghost cells arises...

Kind regards,
Denis. 

Timo Heister

Mar 19, 2014, 3:54:30 PM
to dea...@googlegroups.com
> should not if condition at the first cycle be changed to include
> is_artificial() to make it work?
>
>
> if ((cell1->subdomain_id() == subdomain_id)
> ||
> (subdomain_id == numbers::invalid_subdomain_id))
>
> otherwise for shared triangulation it will go through all cells on every
> processor, right?

Not sure I understand. You only want to look at own cells.

>> Option b) is actually not that much work, because you don't need to
>> implement a lot in the new Triangulation class. The only reason to
>> have it is so that we can detect that this type is used and report
>> correct values for is_ghost(), is_artificial() etc.. I think the only
>> thing to implement in the new class is:
>> 1. locally_owned_elements
>
> locally owned cells?

Well, I was thinking about locally_owned_dofs, so some code needs to
be put into DoFHandler.

> I guess in general after calling GridTools::partition_triangulation
> (n_mpi_processes, triangulation)
> user did not decide yet which processor uses which part of triangulation.
> Supposedly this derived Triangulation class could receive an integer in
> constructor
> (in a normal cases - this_mpi_process)
> to be used as an output of triangulation.locally_owned_subdomain() ?
> Is it what you mean?

Correct, my idea is to have a class parallel::shared::Triangulation
that takes an MPI_Comm and internally handles partitioning (mapped to
MPI_rank). If you implement the handling of ghost_cells() correctly
(in the same way we do for distributed::Triangulation), all algorithms
like interpolate() will work correctly.

>> 2. after refining the mesh call partition(), then mark all cells
>
> Perhaps execute_coarsening_and_refinement() could be overwritten to call the
> parent class function and
> GridTools::partition_triangulation (N, triangulation) ?

Yes, exactly. Similar to what we do in distributed::Tria.

> It seems one would have to provide the number of partitions in constructor
> as well (keep it const) to use here?

No, just look at the MPI_Comm.

>> except own and ghost as artificial by setting the subdomain to
>> artificial.
>
> So the same question of effectively marking ghost cells...

Correct. But this would happen once after refinement and not every
time you need to know if a cell is a ghost cell. We do this in
distributed::Tria using:
1. mark all vertices of own cells as "interesting"
2. iterate over all cells that are not ours, if one of the vertices is
"interesting", this is a ghost cell.
3. all other cells are artificial
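The three steps above can be sketched as one standalone function (plain C++ with only the standard library, no deal.II; the flat Cell structure and all names are invented for illustration): cells are classified as own, ghost (sharing a vertex with an own cell), or artificial.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical flat mesh: each cell is a list of vertex indices plus an
// owning subdomain id.
struct Cell { std::vector<int> vertices; unsigned int owner; };

enum class Kind { own, ghost, artificial };

inline std::vector<Kind> classify(const std::vector<Cell> &cells,
                                  unsigned int my_subdomain)
{
  // 1. mark all vertices of our own cells as "interesting"
  std::set<int> interesting;
  for (const Cell &c : cells)
    if (c.owner == my_subdomain)
      interesting.insert(c.vertices.begin(), c.vertices.end());

  // 2./3. a foreign cell touching an interesting vertex is a ghost cell;
  //       every other foreign cell is artificial
  std::vector<Kind> kind;
  for (const Cell &c : cells)
    {
      if (c.owner == my_subdomain)
        kind.push_back(Kind::own);
      else
        {
          bool touches = false;
          for (int v : c.vertices)
            if (interesting.count(v) != 0) { touches = true; break; }
          kind.push_back(touches ? Kind::ghost : Kind::artificial);
        }
    }
  return kind;
}
```

For a row of three cells sharing vertices, the processor owning the first cell sees its neighbour as a ghost and the far cell as artificial.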

Denis Davydov

Mar 19, 2014, 5:14:36 PM
to dea...@googlegroups.com
On Wednesday, March 19, 2014 8:54:30 PM UTC+1, Timo Heister wrote:
>> Option b) is actually not that much work, because you don't need to 
>> implement a lot in the new Triangulation class. The only reason to 
>> have it is so that we can detect that this type is used and report 
>> correct values for is_ghost(), is_artificial() etc.. I think the only 
>> thing to implement in the new class is: 
>> 1. locally_owned_elements 

> locally owned cells? 

Well, I was thinking about locally_owned_dofs, so some code needs to 
be put into DoFHandler. 

I guess you mean to do dynamic casting in 
const IndexSet& DoFHandler< dim, spacedim >::locally_owned_dofs ( const unsigned int  level )  const
and call an internal function (the one I wrote in the other thread) if it is of parallel::shared::Triangulation type?
 

> I guess in general after calling GridTools::partition_triangulation 
> (n_mpi_processes, triangulation) 
> user did not decide yet which processor uses which part of triangulation. 
> Supposedly this derived Triangulation class could receive an integer in 
> constructor 
> (in a normal cases - this_mpi_process) 
> to be used as an output of triangulation.locally_owned_subdomain() ? 
> Is it what you mean? 

Correct, my idea is to have a class parallel::shared::Triangulation
that takes an MPI_Comm and internally handles partitioning (mapped to
MPI_rank). If you implement the handling of ghost_cells() correctly
(in the same way we do for distributed::Triangulation), all algorithms
like interpolate() will work correctly.

I wonder at what point I should actually call GridTools::partition_triangulation?
Does DoFHandler call some member functions of the triangulation when
it is associated with it?
The distributed triangulation is a huge class; I don't see the answer to this question immediately...

Supposedly the ghost-cell handling is an extension of
tria_accessor.templates.h with dynamic casting to check for the new class?


>> except own and ghost as artificial by setting the subdomain to 
>> artificial. 

> So the same question of effectively marking ghost cells... 

Correct. But this would happen once after refinement and not every 
time you need to know if a cell is a ghost cell. We do this in 
distributed::Tria using: 
1. mark all vertices of own cells as "interesting" 
2. iterate over all cells that are not ours, if one of the vertices is 
"interesting", this is a ghost cell. 
3. all other cells are artificial 

If I understand correctly, you do cell->set_subdomain_id(numbers::artificial_subdomain_id) for those cells,
and everything else is taken care of in tria_accessor.templates.h?

Regards,
Denis.

Timo Heister

Mar 19, 2014, 5:43:52 PM
to dea...@googlegroups.com
>> Well, I was thinking about locally_owned_dofs, so some code needs to
>> be put into DoFHandler.
>
>
> I guess you mean to do dynamic casting in
>
> const IndexSet& DoFHandler< dim, spacedim >::locally_owned_dofs ( const
> unsigned int level ) const
>
> and call an internal function (the one I wrote in the other thread) if it is
> of parallel::shared::Triangulation type?

Yes. If you look into DoFHandler this is handled using Policy classes.
Adding another policy requires writing a couple of functions (and they
basically call existing functionality).

>> Correct, my idea is to have a class parallel::shared::Triangulation
>> that takes an MPI_Comm and internally handles partitioning (mapped to
>
>
> I wonder at what point should I actually call
> GridTools::partition_triangulation ?
> Does DoFHandler calls some member functions of triangulation when
> it is associated to it ?
> Distributed triangulation is a huge class, i don't see the answer to this
> question immediately....

You basically overload create_triangulation and
execute_coarsening_and_refinement, call the methods in the base class,
then call partition_triangulation, and finally unset the
subdomain_id of non-ghost cells (so that they are artificial).
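This override pattern can be sketched with mock classes (plain C++, no deal.II; all class and method names are stand-ins, and partition() / mark_artificial_cells() here only record that they ran, in place of GridTools::partition_triangulation and the subdomain bookkeeping):

```cpp
#include <cassert>
#include <string>
#include <vector>

// Mock base triangulation: records which steps ran, in order.
struct Triangulation
{
  std::vector<std::string> log;
  virtual ~Triangulation() = default;
  virtual void execute_coarsening_and_refinement()
  { log.push_back("refine"); }
};

// Hypothetical shared triangulation: refine via the base class, then
// repartition, then downgrade non-own, non-ghost cells to artificial.
struct SharedTriangulation : Triangulation
{
  void execute_coarsening_and_refinement() override
  {
    Triangulation::execute_coarsening_and_refinement(); // base class first
    partition();             // stand-in for partition_triangulation()
    mark_artificial_cells(); // stand-in for unsetting subdomain ids
  }
  void partition()             { log.push_back("partition"); }
  void mark_artificial_cells() { log.push_back("mark_artificial"); }
};
```

The ordering matters: partitioning must see the refined mesh, and the artificial marking must see the fresh partition.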

>> MPI_rank). If you implement the handling of ghost_cells() correctly
>
> Supposedly this is an extension of
> tria_accessor.templates.h with dynamic casting to check for a new class?

Yep. If you look there, the logic currently is:
if we can cast the Tria to distributed::Tria, then look at the subdomain to
determine whether a cell is artificial/ghost/own; otherwise ignore the subdomain. You
would also try casting to the new type.

> if i understand correct, u do cell->subdomain_id(
> numbers::artificial_subdomain_id) for those cells,
> everything else is taken care in tria_accessor.templates.h?

Yep.

Denis Davydov

Mar 20, 2014, 7:05:44 AM
to dea...@googlegroups.com
Dear Timo,


On Wednesday, March 19, 2014 10:43:52 PM UTC+1, Timo Heister wrote:
>> Well, I was thinking about locally_owned_dofs, so some code needs to
>> be put into DoFHandler.
>
>
> I guess you mean to do dynamic casting in
>
> const IndexSet& DoFHandler< dim, spacedim >::locally_owned_dofs ( const
> unsigned int  level )  const
>
> and call an internal function (the one I wrote in the other thread) if it is
> of parallel::shared::Triangulation type?

Yes. If you look into DoFHandler this is handled using Policy classes.
Adding another policy requires writing a couple of functions (and they
basically call existing functionality).

Could you please elaborate on that?
So far I have kept it in free functions outside the library...


In any case, I am experiencing other problems, namely some errors coming from PETSc even though I don't create any of its objects:

[0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range

this occurs only for parallel runs. Stack is empty, no clue what is wrong...

The current patch and the example problem are attached. Do you have any ideas where to start?
Does what I have written so far look ok to you?


Kind regards,
Denis.

 
problem_all.cc
parallel-shared.patch

Timo Heister

Mar 20, 2014, 8:44:57 AM
to dea...@googlegroups.com
>> Yes. If you look into DoFHandler this is handled using Policy classes.
>> Adding another policy requires writing a couple of functions (and they
>> basically call existing functionality).
>
> Could you please elaborate on it?
> So far i kept it in outside functions...

DoFHandler does different things depending on the kind of
Triangulation used. Take a look at
internal::DoFHandler::Policy::PolicyBase.

Note that we need to do some extra stuff to get the hp::DoFHandler to
work with this.

> In any case, I experience other problems, namely some errors coming from
> PETSc even though I don't create any of its objects:
>
> [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation,
> probably memory access out of range
>
> this occurs only for parallel runs. Stack is empty, no clue what is wrong...

PETSc takes over the crash handler. You have a SEGV somewhere in your
code. Try to run your code in a debugger (and learning to use valgrind
is also a good idea).

> Does what I have written so far look ok to you?

Yes, exactly how I thought it should work!

Why did you change the logic for get_dof_indices()? Asking for DoFs of
artificial cells doesn't make much sense, does it? I know that the
information is there, but it might be cleaner to not allow asking for
this information. Do you agree?

dof_tools.cc: leave this Assert in there. It never hurts to have extra checks.


Do you have write access to the deal.II repo? I think it makes sense
to create a branch for this.

Denis Davydov

Mar 20, 2014, 9:03:45 AM
to dea...@googlegroups.com
On Thursday, March 20, 2014 1:44:57 PM UTC+1, Timo Heister wrote:
DoFHandler does different things depending on the kind of
Triangulation used. Take a look at
internal::DoFHandler::Policy::PolicyBase.

will have a look...
 

Note that we need to do some extra stuff to get the hp::DoFHandler to
work with this.

PETSc takes over the crash handler. You have a SEGV somewhere in your
code. Try to run your code in a debugger (and learning to use valgrind
is also a good idea).

I have used valgrind and gdb for my personal projects, but I struggle to make gdb debug a shared library (deal.II).
After stepping into main() of the program, gdb shows nothing for "info shared", even though the executable is linked against the debug version of deal.II.
Could you briefly describe your workflow/steps/gdb settings for debugging deal.II with gdb from the command line?
 

Why did you change the logic for get_dof_indices()? Asking for DoFs of
artificial cells doesn't make much sense, does it? I know that the
information is there, but it might be cleaner to not allow asking for
this information. Do you agree?

Not sure. I know that in the future I will need to be able to move the vertices of the triangulation (a la step-18),
from the global displacement vector, of course. To do so, supposedly, one should be able to ask for the
global DoF indices on "artificial" cells of a parallel::shared::Triangulation and move all nodes on each processor.

 

dof_tools.cc: leave this Assert in there. It never hurts to have extra checks.

that will break the get_subdomain_association function, which should work fine, since we
deal with, effectively, a normal triangulation where all cells are available on all processors.
That's the reason why I removed it.
And I use get_subdomain_association to create the locally owned IndexSet.
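The idea of deriving the locally owned set from a subdomain association can be modeled standalone (plain C++, no deal.II; the CellDofs record and function name are invented, and the tie-breaking rule for DoFs shared between cells of different subdomains, smallest subdomain wins, is only an assumption for this sketch):

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical input: for each cell, its subdomain id and its global DoFs.
struct CellDofs { unsigned int subdomain; std::vector<int> dofs; };

// Associate every DoF with the smallest subdomain id of any adjacent cell
// (an assumed tie-breaking rule), then collect the DoFs of one subdomain
// as its locally owned set.
inline std::vector<int>
locally_owned_dofs(const std::vector<CellDofs> &cells,
                   int n_dofs,
                   unsigned int my_subdomain)
{
  std::vector<unsigned int> owner(n_dofs, ~0u);
  for (const CellDofs &c : cells)
    for (int d : c.dofs)
      owner[d] = std::min(owner[d], c.subdomain);

  std::vector<int> owned;
  for (int d = 0; d < n_dofs; ++d)
    if (owner[d] == my_subdomain)
      owned.push_back(d);
  return owned;
}
```

Note that this walks over every DoF of every cell, which is exactly the scalability concern Timo raises below.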
 

Do you have write access to the deal.II repo? I think it makes sense
to create a branch for this.

yes, I do. Although so far I have submitted only to the complex branch, supposedly write access is not restricted to particular operations.
But in order not to mess things up, I would appreciate it if you created a branch for this task, and I will submit the work there.

Regards,
Denis.
 

Timo Heister

Mar 20, 2014, 9:51:00 AM
to dea...@googlegroups.com
> i used valgrind and gdb for my personal projects, but i struggle making Gdb
> debug a shared library (deal.II).
> After stepping in main() of the program, gdb shows nothing for "info
> shared", even though the executable is linked against the debug version of
> deal.II.
> Could you briefly tell your workflow/steps/Gdb settings to debug deal.II
> with Gdb from a command line?

No settings/setup or anything. You run your application linking to
deal.II and you can step into the code as is.

What happens if you step into a deal.II function? Is gdb complaining
about missing source?

Are you building and installing deal.II as an out of source build (I am)?

>> dof_tools.cc: leave this Assert in there. It never hurts to have extra
>> checks.
>
> that will break "get_subdomain_association" function, which should work fine
> since we
> deal with, effectively, normal triangulation where all cells are available
> for all processors.
> that's the reason why i removed it.
> And I use get_subdomain_association to create locally owned IndexSet.

Now that you create ghost cells it should be a lot easier to get the
locally owned IndexSet (without creating a std::vector<dofindex> for
ALL DoFs) without using that function.

> yes, i do. Although so far i submitted only to complex branch, supposedly
> the write access is not restricted to operations.
> But in order not to mess things up, i would appreciate if you create a
> branch for this task and i will submit work there.

Will do.

Timo Heister

Mar 20, 2014, 11:01:44 AM
to dea...@googlegroups.com
>> yes, i do. Although so far i submitted only to complex branch, supposedly
>> the write access is not restricted to operations.
>> But in order not to mess things up, i would appreciate if you create a
>> branch for this task and i will submit work there.
>
> Will do.

svn co https://svn.dealii.org/branches/branch_sharedtria

Denis Davydov

Mar 20, 2014, 2:11:06 PM
to dea...@googlegroups.com
On Thursday, March 20, 2014 2:51:00 PM UTC+1, Timo Heister wrote:
> i used valgrind and gdb for my personal projects, but i struggle making Gdb
> debug a shared library (deal.II).
> After stepping in main() of the program, gdb shows nothing for "info
> shared", even though the executable is linked against the debug version of
> deal.II.
> Could you briefly tell your workflow/steps/Gdb settings to debug deal.II
> with Gdb from a command line?

No settings/setup or anything. You run your application linking to
deal.II and you can step into the code as is.

for whatever reason I can't do it (OS X, gdb 7.7).
 

What happens if you step into a deal.II function? Is gdb complaining
about missing source?

I just noticed that gdb can't even set a breakpoint inside functions defined in the file I compile (e.g. locally_owned_dofs_with_subdomain):
    Cannot insert breakpoint 1.
    Cannot access memory at address 0x66b9


@all: does anyone have positive experience debugging deal.II on Mavericks 10.8.2?
 

Are you building and installing deal.II as an out of source build (I am)?

yes, I am. 


> yes, i do. Although so far i submitted only to complex branch, supposedly
> the write access is not restricted to operations.
> But in order not to mess things up, i would appreciate if you create a
> branch for this task and i will submit work there.

Will do.

thanks, used it already... 


Stepping line-by-line, the problem appears during renumbering:

 DoFRenumbering::subdomain_wise (dof_handler);
(gdb) s

Program received signal SIGSEGV, Segmentation fault.
0x0000000102cb7864 in ?? ()

 (gdb) bt
#0  0x0000000102cb7864 in ?? ()
#1  0x00000001175a5e00 in ?? ()
#2  0x0000000000000005 in ?? ()
#3  0x00007fff5fbfde50 in ?? ()
#4  0x00007fff5fbfe630 in ?? ()
#5  0xffffffffffffffff in ?? ()
#6  0x00007fff5fbfe3f0 in ?? ()
#7  0x00007fff5fbfe630 in ?? ()
#8  0x0000000000000029 in ?? ()
#9  0x0000000f00000002 in ?? ()
#10 0x00007fff5fbfe3f0 in ?? ()
#11 0x00007fff5fbfe630 in ?? ()
#12 0x0000000000000002 in ?? ()
#13 0x000000011882dd50 in ?? ()
#14 0x000000011882dd60 in ?? ()
#15 0x000000011882dd60 in ?? ()
#16 0x0000000118d02850 in ?? ()
#17 0x00000001175a1000 in ?? ()
#18 0x0000000000000000 in ?? ()

which again uses get_subdomain_association(), so I think that the Assert (cell->is_artificial(),...) should be dropped.

I have a feeling the problem is somewhere in distribute_dofs. It leads back to the "Policy" class...

Supposedly the distribution is now done around line 249:

for (; cell != endc; ++cell)
  if ((subdomain_id == numbers::invalid_subdomain_id)
      ||
      (cell->subdomain_id() == subdomain_id))
    next_free_dof
      = Implementation::distribute_dofs_on_cell (dof_handler,
                                                 cell,
                                                 next_free_dof);

which will not distribute DOFs on all cells as it should... 

Do you propose to derive a ParallelShared policy from Sequential and
re-implement some functions (which ones)?
If everything has to be changed, then one can derive it from PolicyBase, of course, but I hope that is not the case...

What I am missing is at what point the Policy is connected to the Triangulation class; where does that happen?

Timo Heister

Mar 20, 2014, 7:12:52 PM
to dea...@googlegroups.com
> for whatever reason i can't do it (os-x, gdb7.7).

Well, switch to linux? :-)

> Do you propose to derive a policy ParallelShared from Sequential and
> re-implement some functions (which?)?
> If everything have to be changed, then one can derive it from PolicyBase, of
> course, but I hope it is not the case...

Good question. Either you keep track of all cells (and have no
artificial cells), or you need to base your implementation on the
parallel one that is used with p4est that ignores artificial cells and
does some MPI communication. The latter has the advantage that it
scales a lot better with large number of unknowns.

> What i miss is at what point Policy is connected to Triangulation class,
> where does it happen?

Not at all. It gets called by dof_handler.distribute_dofs()

Denis Davydov

Mar 21, 2014, 7:05:57 AM
to dea...@googlegroups.com


On Friday, March 21, 2014 12:12:52 AM UTC+1, Timo Heister wrote:
> Do you propose to derive a policy ParallelShared from  Sequential and
> re-implement some functions (which?)?
> If everything have to be changed, then one can derive it from PolicyBase, of
> course, but I hope it is not the case...

Good question. Either you keep track of all cells (and have no
artificial cells), or you need to base your implementation on the
parallel one that is used with p4est that ignores artificial cells and
does some MPI communication. The latter has the advantage that it
scales a lot better with large number of unknowns.

Frankly, I don't want to make my life more complicated than it already is ;)

Do you think the following logic will work:

1) We stick to the "absence" of artificial cells and distribute DoFs on every cell on each processor (thus everything is consistent by default).
Or, stick to "artificial cells == cells on the current processor which know nothing about DoFs".
2) locally_owned_subdomain() returns a proper subdomain_id (this_mpi_process), assigned in the constructor, instead of invalid_subdomain_id. So effectively, that is the only difference from a normal Triangulation.
Plus automatic calls of GridTools::partition_triangulation() on creation or coarsening/refinement.
3) A policy for the shared triangulation is derived from the normal one, with a couple of tweaks to properly calculate number_cache (locally_owned_dofs, locally_relevant_dofs == all DoFs, etc.)

Hopefully that logic works straight away for hp refinement and FETools::interpolate.
 

> What i miss is at what point Policy is connected to Triangulation class,
> where does it happen?

Not at all. It gets called by dof_handler.distribute_dofs()

I guess what I was looking for is this part of the constructor of DoFHandler:

   if (dynamic_cast<const parallel::distributed::Triangulation< dim, spacedim >*>
       (&tria)
      == 0)
     policy.reset (new internal::DoFHandler::Policy::Sequential<dim,spacedim>());
   else
     policy.reset (new internal::DoFHandler::Policy::ParallelDistributed<dim,spacedim>()); 

where a policy is created depending on the tria object. Now everything is clear with regards to this question. 

Timo Heister

Mar 21, 2014, 8:11:32 AM
to dea...@googlegroups.com
> 3) a policy for shared triangulation is derived from the normal one with a
> couple of tweaks to calculate properly number_cache (locally_owned_dofs,
> locally_relevant_dofs == all_dofs, etc)
>
> hopefully that logic works straight away for hp-refinement and
> FETools::interpolate.

Argh. Now we are running into the next problem: the policy stuff only
works for non-hp DoFHandlers so far. I don't know what the current
status of merging hp::DoFHandler into DoFHandler is. So if you don't
want to take up that task, you need to leave everything as is (no
artificial cells, dofs are known on all machines).

Denis Davydov

Mar 21, 2014, 8:49:28 AM
to dea...@googlegroups.com
On Friday, March 21, 2014 1:11:32 PM UTC+1, Timo Heister wrote:
> 3) a policy for shared triangulation is derived from the normal one with a
> couple of tweaks to calculate properly number_cache (locally_owned_dofs,
> locally_relevant_dofs == all_dofs, etc)
>
> hopefully that logic works straight away for hp-refinement and
> FETools::interpolate.

Argh. Now we are running into the next problem: the policy stuff only
works for non-hp DoFHandlers so far. I don't know what the current
status of merging hp::DoFHandler into DoFHandler is. So if you don't

Perhaps it is not a problem.
For the Sequential-like policy I copy-paste-modify Sequential<dim,spacedim>::distribute_dofs() and renumber_dofs() to store proper values in
number_cache.locally_owned_dofs.


I know nothing about hp::DoFHandler, but it also stores locally_owned_dofs in the number_cache class, although without any helper Policy class:

template <int dim, int spacedim>
const IndexSet &
DoFHandler<dim, spacedim>::locally_owned_dofs() const
{
  return number_cache.locally_owned_dofs;
}

and those are set up in hp::DoFHandler<dim,spacedim>::distribute_dofs().

Theoretically, I could do a dynamic cast there and check if the triangulation is of parallel::shared type,
and then set up those IndexSets appropriately.

The question is -- will it break something?
I don't recall any examples of hp::DoFHandler in parallel on shared triangulations. Does anybody do it?
But hopefully it is possible, and having the right IndexSet for locally_owned_dofs will not break anything.

I would say, all in all, my point is to make:
a) locally_owned_dofs() indeed give the locally owned DoFs even on a shared triangulation with MPI
b) Triangulation<dim, spacedim>::locally_owned_subdomain() behave nicely and return the proper subdomain (this_mpi_process) and thus make FETools::interpolate work without re-writing it.

What do you think? 
 
> want to take up that task, you need to leave everything as is (no
> artificial cells, dofs are known on all machines).

that's what I wanted to do for now (points 1 and 2)
 

Timo Heister

Mar 21, 2014, 9:19:43 AM
to dea...@googlegroups.com
> Theoretically, i could do dynamic cast there and check if triangulation is
> of parallel::shared type,
> then i set up those IndexSets appropriately.

Yep, sounds like a good plan.

> The question is -- will it break something?
> I don't recall any examples of hp::DoFHandler in parallel on shared
> triangulations. Anybody does it?

Nope.

> a) locally_owned_dofs to indeed give locally_owned_dofs even on a shared
> triangulation with MPI

Yes, return only what that processor owns, but only when using the new
Triangulation type. I guess you need a cast in hp::DoFHandler and a
new policy in DoFHandler.

> b) Triangulation< dim, spacedim >::locally_owned_subdomain() behave nicely
> and return the proper subdomain (this_mpi_processor) and thus make
> FETools::interpolate work without re-writing it.

You mean shared::Triangulation? Yes.

>> want to take up that task, you need to leave everything as is (no
>> artificial cells, dofs are known on all machines).
>
> that's what i wanted to do for now (point 1 and 2)

This sounds like a good first step. Of course this means redundant
work in parallel in a couple of places, so your memory consumption and
runtime won't scale to say to more than 1000 cores and a couple
million unknowns.

Denis Davydov

Mar 21, 2014, 2:55:11 PM
to dea...@googlegroups.com
A working version is in the branch. So far FETools::interpolate works (at least it does not segfault).
The locally_owned IndexSet is also ok. Have a look and let me know what you think.

Regards,
Denis.

Wolfgang Bangerth

Mar 24, 2014, 4:06:07 PM
to dea...@googlegroups.com

>> Do you propose to derive a policy ParallelShared from Sequential and
>> re-implement some functions (which?)?
>> If everything have to be changed, then one can derive it from PolicyBase, of
>> course, but I hope it is not the case...
>
> Good question. Either you keep track of all cells (and have no
> artificial cells), or you need to base your implementation on the
> parallel one that is used with p4est that ignores artificial cells and
> does some MPI communication. The latter has the advantage that it
> scales a lot better with large number of unknowns.

But for large numbers of unknowns, the shared triangulation approach doesn't
work anyway. In fact I think that you should be able to use the sequential
policy without any modification at all, but it might be a nice feature if you
derived your own policy ParallelShared from Sequential that essentially does
the following:

   void ParallelShared::distribute_dofs (...) {
     this->Sequential::distribute_dofs (...);
     DoFRenumbering::subdomain_wise (...);
   }

since this is what one typically wants to do right anyway. I guess this is
essentially your strategy #3 in a following email, where you add another
couple of things after the call to subdomain_wise().

(Note that we allocate memory for all DoFs on all cells we locally store, so
not enumerating them just because they are on a subdomain we don't own on the
current processor doesn't actually save any memory.)

Best
W.


--
------------------------------------------------------------------------
Wolfgang Bangerth email: bang...@math.tamu.edu
www: http://www.math.tamu.edu/~bangerth/

Wolfgang Bangerth

Mar 24, 2014, 4:10:10 PM
to dea...@googlegroups.com

> For Sequential-like policy I copy-paste-modify
> Sequential<dim,spacedim>::distribute_dofs() and renumber_dofs() to store
> proper values in
> number_cache.locally_owned_dofs.

I don't know if you already committed this, but please don't do copy-paste but
instead create a derived class whose member functions just call the base
class's functions at the top and then compute whatever other data you need.


> The question is -- *will it break something*?
> I don't recall any examples of hp::DoFHandler in parallel on shared
> triangulations. Anybody does it?
> But hopefully, it is possible and having the right IndexSet for
> locally_owned_dofs will not break anything.
>
> I would say, all-in-all, my point is to make:
> a) locally_owned_dofs to indeed give locally_owned_dofs even on a shared
> triangulation with MPI

Yes.

Timo Heister

Mar 24, 2014, 4:47:11 PM
to dea...@googlegroups.com
> But for large numbers of unknowns, the shared triangulation approach doesn't
> work anyway. In fact I think that you should be able to use the sequential
> policy without any modification at all

But then there is no support for hp, which is -- as far as I
understand -- one of the main reasons to do this whole exercise.

Maybe it would make sense to work on integrating hp::DoFHandler into
DoFHandler and the policy first?

Wolfgang Bangerth

Mar 24, 2014, 5:01:43 PM
to dea...@googlegroups.com
On 03/24/2014 03:47 PM, Timo Heister wrote:
>> But for large numbers of unknowns, the shared triangulation approach doesn't
>> work anyway. In fact I think that you should be able to use the sequential
>> policy without any modification at all
>
> But then there is no support for hp, which is -- as far as I
> understand -- one of the main reasons to do this whole exercise.
>
> Maybe it would make sense to work on integrating hp::DoFHandler into
> DoFHandler and the policy first?

Maybe, but it's a lot more work.

I think if you create a parallel::shared::Triangulation and a new policy that
derives from the existing Sequential one, then you're at least not obstructing
anything. If you want to merge the ::DoFHandler and hp::DoFHandler, you'll
simply have to adjust the Sequential policy and the ParallelShared policy will
just continue to work as it did before.

Timo Heister

Mar 24, 2014, 5:11:31 PM
to dea...@googlegroups.com
> I think if you create a parallel::shared::Triangulation and a new policy
> that derives from the existing Sequential one, then you're at least not
> obstructing anything. If you want to merge the ::DoFHandler and
> hp::DoFHandler, you'll simply have to adjust the Sequential policy and the
> ParallelShared policy will just continue to work as it did before.

That is an easy way to get a non-hp DoFHandler up and running with the
new Triangulation, agreed. It does not help with running
hp::DoFHandler with MPI though.

Wolfgang Bangerth

Mar 24, 2014, 5:19:15 PM
to dea...@googlegroups.com
Correct.

I don't have an idea for this, short of putting in a lot of work.

Denis Davydov

Mar 24, 2014, 5:22:07 PM
to dea...@googlegroups.com
On Monday, March 24, 2014 9:06:07 PM UTC+1, Wolfgang Bangerth wrote:
But for large numbers of unknowns, the shared triangulation approach doesn't
work anyway. In fact I think that you should be able to use the sequential
policy without any modification at all, but it might be a nice feature if you
derived your own policy ParallelShared from Sequential that essentially does
the following:

   void ParallelShared::distribute_dofs (...) {
     this->Sequential::distribute_dofs (...);
     DoFRenumbering::subdomain_wise (...);
   }

since this is what one typically wants to do right anyway. I guess this is
essentially your strategy #3 in a following email, where you add another
couple of things after the call to subdomain_wise().

(Note that we allocate memory for all DoFs on all cells we locally store, so
not enumerating them just because they are on a subdomain we don't own on the
current processor doesn't actually save any memory.)

That's exactly what is already there in the dedicated branch. There are some tricks, but I commented all the decisions.

Regards,
Denis. 

Denis Davydov

Mar 24, 2014, 5:23:19 PM
to dea...@googlegroups.com
On Monday, March 24, 2014 9:10:10 PM UTC+1, Wolfgang Bangerth wrote:

> For Sequential-like policy I copy-paste-modify
> Sequential<dim,spacedim>::distribute_dofs() and renumber_dofs()  to store
> proper values in
> number_cache.locally_owned_dofs.

I don't know if you already committed this, but please don't do copy-paste but
instead create a derived class whose member functions just call the base
class's functions at the top and then compute whatever other data you need.

that's how it is already. 
 

Denis Davydov

Mar 24, 2014, 5:33:33 PM
to dea...@googlegroups.com
why would there be any problems with hp::DoFHandler? 
There is no policy used in hp::DoFHandler, but I added some dynamic casts to check for triangulation class. 
One thing would be to change:

number_cache.locally_owned_dofs = IndexSet (number_cache.n_global_dofs);
number_cache.locally_owned_dofs.add_range (0, number_cache.n_global_dofs);

if we deal with parallel::shared::Triangulation.


And another is to add DoFRenumbering::SubdomainWise.


btw, I was wrong, I did not add renumbering so far. The reason is that it is, strictly speaking, not necessary.

At least for Trilinos vectors this should work with or without renumbering. Thus I kept it outside of distribute_dofs().

The IndexSet of locally_owned DoFs should be correct in either case.

But adding renumbering is easy, let me know if you think it really should be there.


Regards,

Denis.

Wolfgang Bangerth

Mar 24, 2014, 10:40:19 PM
to dea...@googlegroups.com

> why would there be any problems with hp::DoFHandler?
> There is no policy used in hp::DoFHandler, but I added some dynamic casts to
> check for triangulation class.

I think Timo was thinking of equipping hp::DoFHandler with policies like we do
with ::DoFHandler, or merging the two altogether.


> One thing would be to change:
>
> number_cache.locally_owned_dofs= IndexSet(number_cache.n_global_dofs);
>
> number_cache.locally_owned_dofs.add_range (0,number_cache.n_global_dofs);
>
>
> if we deal with parallel::shared::Triangulation.

Yes. But that shouldn't be too difficult.


> And another is to add DoFRenumbering::SubdomainWise.

This should almost be trivial. If you see how it's done for ::DoFHandler, you
will immediately see how to do it for hp::DoFHandler. Or even be able to
implement it in a function that works on a generic DH argument.


> btw, I am was wrong, i did not add renumbering so far. The reason is, it is
> not strictly speaking, necessary.
>
> At least for Trilinos vectors this should work with or without renumbering.
> Thus i kept it outside of the distribute_dofs().

Right. But wouldn't it be convenient to do? It's definitely necessary for
PETSc, and it's almost certainly inefficient not to do with Trilinos.

Best
Wolfgang

Denis Davydov

Mar 25, 2014, 4:33:08 AM
to dea...@googlegroups.com
On Tuesday, March 25, 2014 3:40:19 AM UTC+1, Wolfgang Bangerth wrote:

> why would there be any problems with hp::DoFHandler?
> There is no policy used in hp::DoFHandler, but I added some dynamic casts to
> check for triangulation class.

I think Timo was thinking of equipping hp::DoFHandler with policies like we do
with ::DoFHandler, or merging the two altogether.

Doing that for parallel::shared::Triangulation, I would say there are some problems with using policies.
The main thing is that the policy returns an IndexSet, and thus before you finish with the policy, some simple things like
DoFHandler::n_dofs() are undefined. Therefore I cannot use almost any function like get_subdomain_association()
within the policy.

One way out I see is to pass a reference to number_cache, so that the policy does not return it but rather populates the one belonging to the DoFHandler.
Right now I have to do a static cast outside of the Policy to finish some stuff.

 


> One thing would be to change:
>
> number_cache.locally_owned_dofs= IndexSet(number_cache.n_global_dofs);
>
> number_cache.locally_owned_dofs.add_range (0,number_cache.n_global_dofs);
>
>
> if we deal with parallel::shared::Triangulation.

Yes. But that shouldn't be too difficult.

So far I leave everything related to hp under Asserts as not implemented, since I don't use hp now.
 


> And another is to add DoFRenumbering::SubdomainWise.

This should almost be trivial. If you see how it's done for ::DoFHandler, you
will immediately see how to do it for hp::DoFHandler. Or even be able to
implement it in a function that works on a generic DH argument.


> btw, I am was wrong, i did not add renumbering so far. The reason is, it is
> not strictly speaking, necessary.
>
> At least for Trilinos vectors this should work with or without renumbering.
> Thus i kept it outside of the distribute_dofs().

Right. But wouldn't it be convenient to do? It's definitely necessary for
PETSc, and it's almost certainly inefficient to not do with Trilinos. 

A strange thing happens if I add

#include <deal.II/dofs/dof_renumbering.h>

to dof_handler.h (non-hp):

In file included from /Users/davydden/libs-sources/dealii/deal.II-trunk/deal.II/include/deal.II/dofs/dof_renumbering.h:25:0,
                 from /Users/davydden/libs-sources/dealii/deal.II-trunk/deal.II/include/deal.II/dofs/dof_handler.h:31,
                 from /Users/davydden/libs-sources/dealii/deal.II-trunk/deal.II/include/deal.II/fe/mapping.h:25,
                 from /Users/davydden/libs-sources/dealii/deal.II-trunk/deal.II/include/deal.II/fe/fe_base.h:30,
                 from /Users/davydden/libs-sources/dealii/deal.II-trunk/deal.II/include/deal.II/fe/fe.h:22,
                 from /Users/davydden/libs-sources/dealii/deal.II-trunk/deal.II/include/deal.II/hp/fe_collection.h:22,
                 from /Users/davydden/libs-sources/dealii/deal.II-trunk/deal.II/include/deal.II/hp/dof_faces.h:22,
                 from /Users/davydden/libs-sources/dealii/deal.II-trunk/deal.II/source/hp/dof_faces.cc:18:
/Users/davydden/libs-sources/dealii/deal.II-trunk/deal.II/include/deal.II/hp/dof_handler.h:205:41: error: 'FECollection' in namespace 'dealii::hp' does not name a type
/Users/davydden/libs-sources/dealii/deal.II-trunk/deal.II/include/deal.II/hp/dof_handler.h:205:45: error: ISO C++ forbids declaration of 'parameter' with no type [-fpermissive]
/Users/davydden/libs-sources/dealii/deal.II-trunk/deal.II/include/deal.II/hp/dof_handler.h:205:57: error: expected ',' or '...' before '<' token
/Users/davydden/libs-sources/dealii/deal.II-trunk/deal.II/include/deal.II/hp/dof_handler.h:536:11: error: 'FECollection' in namespace 'dealii::hp' does not name a type
/Users/davydden/libs-sources/dealii/deal.II-trunk/deal.II/include/deal.II/hp/dof_handler.h:646:28: error: ISO C++ forbids declaration of 'type name' with no type [-fpermissive]
/Users/davydden/libs-sources/dealii/deal.II-trunk/deal.II/include/deal.II/hp/dof_handler.h:646:84: error: template argument 1 is invalid

don't see what's wrong...

Kind regards,
Denis.

 

Wolfgang Bangerth

Mar 25, 2014, 9:01:58 AM
to dea...@googlegroups.com

> doing that parallel::shared::Triangulation, i would say there are some
> problems with using policies.
> Main thing is that it returns IndexSet and thus before you finish with policy
> some simple things like
> DoFHandler.n_dofs() are undefined. Therefore I can not use almost any function
> like get_subdomain_association()
> within the policy.

Ah, right. Was that the reason why you didn't want to call
DoFRenumbering::component_wise from the policy?


> One way out I see is to pass a reference to number_cache so that it does not
> return it, but rather populate the one belonging to DoFHandler.

I think that's not unreasonable.


> So far I leave everything related to hp under Asserts as not-implemented,
> since i don't use hp now.

OK.
You have a circular list of header file inclusions (fe_collection.h includes
fe.h which includes mapping.h which includes dof_handler.h which includes
dof_renumbering.h which needs hp/dof_handler.h which needs fe_collection.h).
I've made a couple of modifications on mainline for this. Can you do a merge
from mainline to your branch and try again?

Best
W.

Denis Davydov

Mar 25, 2014, 2:19:18 PM
to dea...@googlegroups.com
On Tuesday, March 25, 2014 2:01:58 PM UTC+1, Wolfgang Bangerth wrote:

> doing that parallel::shared::Triangulation, i would say there are some
> problems with using policies.
> Main thing is that it returns IndexSet and thus before you finish with policy
> some simple things like
> DoFHandler.n_dofs() are undefined. Therefore I can not use almost any function
> like get_subdomain_association()
> within the policy.

Ah, right. Was that the reason why you didn't want to call
DoFRenumbering::component_wise from the policy?

no, i found out about this problem later.
 


> One way out I see is to pass a reference to number_cache so that it does not
> return it, but rather populate the one belonging to DoFHandler.

I think that's not unreasonable.

will give it a try...
 

You have a circular list of header file inclusions (fe_collection.h includes
fe.h which includes mapping.h which includes dof_handler.h which includes
dof_renumbering.h which needs hp/dof_handler.h which needs fe_collection.h).
I've made a couple of modifications on mainline for this. Can you do a merge
from mainline to your branch and try again? 

I am not a big user of svn; while trying to merge I got some errors:

$svn merge ^/trunk
svn: E195016: Merge tracking not allowed with missing subtrees; try restoring these items first:
/Users/davydden/libs-sources/dealii/deal.II-trunk/deal.II/doc/users/Config.sample
$ svn status
!       deal.II/doc/users/Config.sample

$ svn revert deal.II/doc/users/Config.sample
Reverted 'deal.II/doc/users/Config.sample'
$ svn status
!       deal.II/doc/users/config.sample
 

Can you please merge the branch? I guess it is easier than looking at this issue with Config.sample / config.sample.

Best regards,
Denis.

Wolfgang Bangerth

Mar 25, 2014, 3:12:51 PM
to dea...@googlegroups.com

> I am not a big user of svn, while trying to merge i got some errors:
>
> $svn merge ^/trunk
> svn: E195016: Merge tracking not allowed with missing subtrees; try
> restoring these items first:
> /Users/davydden/libs-sources/dealii/deal.II-trunk/deal.II/doc/users/Config.sample
>
> $ svn status
> ! deal.II/doc/users/Config.sample
>
> $ svn revert deal.II/doc/users/Config.sample
> Reverted 'deal.II/doc/users/Config.sample'
> $ svn status
> ! deal.II/doc/users/config.sample

Timo -- did you just branch the /trunk/deal.II directory or all of /trunk?

Timo Heister

Mar 25, 2014, 3:46:34 PM
to dea...@googlegroups.com
No:
$ svn ls https://svn.dealii.org/branches/branch_sharedtria/
deal.II/
tests/

But maybe Denis did not check out the whole repository or ran the
command inside "deal.II"?
> --
> The deal.II project is located at http://www.dealii.org/
> For mailing list/forum options, see
> https://groups.google.com/d/forum/dealii?hl=en
> --- You received this message because you are subscribed to the Google
> Groups "deal.II User Group" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to dealii+un...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

Denis Davydov

Mar 25, 2014, 3:50:26 PM
to dea...@googlegroups.com
On Tuesday, March 25, 2014 8:46:34 PM UTC+1, Timo Heister wrote:

But maybe Denis did not check out the whole repository or ran the
command inside "deal.II"?


I don't know what the problem is, but I did check out the whole repository.
I switched to trunk now (to look at laplace_transfrom) and still see the same problem:

$ svn status
!       deal.II/doc/users/Config.sample 

no clue what's wrong here... i definitely did not touch that file myself. 

Timo Heister

Mar 25, 2014, 4:04:30 PM
to dea...@googlegroups.com
Just revert the file.

If you want, I can do the merge from mainline for you.

Denis Davydov

Mar 25, 2014, 4:16:13 PM
to dea...@googlegroups.com
On Tuesday, March 25, 2014 9:04:30 PM UTC+1, Timo Heister wrote:
Just revert the file.

just reverting the file does not work :)

$ svn status
!       deal.II/doc/users/Config.sample
$ svn revert deal.II/doc/users/Config.sample
Reverted 'deal.II/doc/users/Config.sample'
$ svn status
!       deal.II/doc/users/config.sample
$ svn revert deal.II/doc/users/config.sample
Reverted 'deal.II/doc/users/config.sample'
$ svn status
!       deal.II/doc/users/Config.sample


tried it before... 


If you want, I can do the merge from mainline for you.

Please, do so.

Timo Heister

Mar 25, 2014, 4:17:35 PM
to dea...@googlegroups.com
you can try "rm -rf deal.II/doc/users" and then "svn up"

>> If you want, I can do the merge from mainline for you.
>
> Please, do so.

done.

Denis Davydov

Mar 25, 2014, 4:24:51 PM
to dea...@googlegroups.com
On Tuesday, March 25, 2014 9:17:35 PM UTC+1, Timo Heister wrote:
you can try "rm -rf deal.II/doc/users" and then "svn up"

I think it is related to OS X somehow, or to the svn version I have (1.8.8); presumably the case-insensitive default OS X file system makes Config.sample and config.sample collide. When I fully remove the users folder, svn status shows both files:

!       deal.II/doc/users/Config.sample
!       deal.II/doc/users/config.sample

however when I do svn up, only one of them is updated:
$ svn up
Updating '.':
Restored 'deal.II/doc/users'
Restored 'deal.II/doc/users/title.html'
Restored 'deal.II/doc/users/Config.sample'
Restored 'deal.II/doc/users/toc.html'
Restored 'deal.II/doc/users/index.html'
Restored 'deal.II/doc/users/navbar.html'
Restored 'deal.II/doc/users/CMakeLists.txt.sample'
Restored 'deal.II/doc/users/CMakeLists.txt.sample2'
Restored 'deal.II/doc/users/CMakeLists.txt.sample3'
Restored 'deal.II/doc/users/doxygen.html'
Restored 'deal.II/doc/users/cmakelists.html'
Restored 'deal.II/doc/users/cmake.html'
 

>> If you want, I can do the merge from mainline for you.
>
> Please, do so.

done.
 
thanks! 

Best regards,
Denis

Denis Davydov

Mar 25, 2014, 5:24:43 PM
to dea...@googlegroups.com
I updated the branch. DoFRenumbering and calculation of locally_owned_dofs are done inside the policy now.
Please, have a look.

Regards,
Denis.

Denis Davydov

Mar 25, 2014, 6:56:49 PM
to dea...@googlegroups.com
I modified step-18 to use parallel::shared::Triangulation; it works fine (attached).

Regards,
Denis.
step-18.cc

Denis Davydov

Mar 26, 2014, 5:30:26 AM
to dea...@googlegroups.com
A small note. Currently I do not calculate the IndexSets of locally_owned_dofs_per_processor.
But if we stick with the renumbering done internally, this could be trivially added based on
n_locally_owned_dofs_per_processor(). I can also think of a more general approach based on
DoFTools::get_subdomain_association(), as an extension of what I already do for a single processor.

Let me know what you think about it.

Regards,
Denis.

Denis Davydov

Apr 25, 2014, 1:21:36 AM
to dea...@googlegroups.com
Dear all,

In an attempt to revive the topic:
some basic documentation has been added.
The branch is working and waiting to be reviewed and
hopefully merged to trunk :D

Regards,
Denis.

Timo Heister

Apr 25, 2014, 10:26:50 AM
to dea...@googlegroups.com
Denis,

> In an attempt to revive the topic,
> i would like to say that some basic documentation was added.
> The branch is working and waits to be reviewed and
> hopefully merged to trunk :D

I am sorry. I had too much on my plate recently.

Before we go about merging:
1) a number of tests in the testsuite are broken:
example: mpi/renumber_z_order_01.mpirun=4.debug

3928: An error occurred in line <2526> of file
</ssd/branch_sharedtria/deal.II/source/dofs/dof_handler_policy.cc> in
function
3928: virtual void
dealii::internal::DoFHandler::Policy::ParallelDistributed<2,
2>::renumber_dofs(const std::vector<types::global_dof_index> &,
dealii::DoFHandler<dim, spacedim> &,
dealii::internal::DoFHandler::NumberCache &) const [dim = 2, spacedim
= 2]
3928: The violated condition was:
3928: number_cache.locally_owned_dofs.n_elements() == new_numbers.size()
3928: The name and call sequence of the exception was:
3928: ExcInternalError()

Can you take a look at those please?

(I submitted the results to cdash, but cdash decided to not show the
test results)

Let me know if you need help running the testsuite.

2) We need to write new tests. I will start working on this a bit today.

3) We should replace step-18 (and update its documentation).

Timo Heister

Apr 25, 2014, 10:34:41 AM
to dea...@googlegroups.com
> (I submitted the results to cdash, but cdash decided to not show the
> test results)

correction, here are the results (ignore the tests in fe/*):
http://cdash.kyomu.43-1.org/viewTest.php?onlyfailed&buildid=5545

Denis Davydov

Apr 25, 2014, 10:38:06 AM
to dea...@googlegroups.com
Hi Timo,

Thanks for looking at it.


On Friday, April 25, 2014 4:34:41 PM UTC+2, Timo Heister wrote:
> (I submitted the results to cdash, but cdash decided to not show the
> test results)

correction, here are the results (ignore the tests in fe/*):
http://cdash.kyomu.43-1.org/viewTest.php?onlyfailed&buildid=5545


That is strange, in theory my modifications should not affect anything. 
I will look closely at those tests tonight to see what causes problems.

Regards,
Denis.  

Timo Heister

Apr 25, 2014, 5:38:53 PM
to dea...@googlegroups.com
Denis,

I started working on tests: take a look at tests/sharedtria/tria_01. I
haven't worked on the DoFHandler yet.

I also moved the shared::Triangulation stuff into separate files.

Should we add things like:
n_locally_owned_active_cells(), n_global_active_cells(),
n_locally_owned_active_cells_per_processor()
or not? This would simplify step-18 so we don't need
GridTools::count_cells_with_subdomain_association().

I put in a new step-18 based on your work.



Wolfgang Bangerth

Apr 25, 2014, 7:00:55 PM
to dea...@googlegroups.com
On 04/25/2014 04:38 PM, Timo Heister wrote:
> Denis,
>
> I started working on tests: take a look at tests/sharedtria/tria_01. I
> haven't worked on the DoFHandler yet.
>
> I also moved the shared::Triangulation stuff into separate files.
>
> Should we add things like:
> n_locally_owned_active_cells(), n_global_active_cells(),
> n_locally_owned_active_cells_per_processor()
> or not? This would simplify step-18 so we don't need
> GridTools::count_cells_with_subdomain_association().

Yes, definitely.

Denis Davydov

Apr 27, 2014, 4:26:10 AM
to dea...@googlegroups.com
Hi Timo,

I think I fixed this problem. So far, the following tests FAILED:
4226 - gla/vec_05.mpirun=4.debug (Failed)
4227 - gla/vec_05.mpirun=4.release (Failed)
5295 - mpi/codim_01.mpirun=3.debug (Failed)
5505 - mpi/multigrid_adaptive.mpirun=1.debug (Failed)
5506 - mpi/multigrid_adaptive.mpirun=1.release (Failed)
5515 - mpi/no_flux_constraints_02.mpirun=1.debug (Failed)
5516 - mpi/no_flux_constraints_02.mpirun=1.release (Failed)
5517 - mpi/no_flux_constraints_02.mpirun=9.debug (Failed)
5518 - mpi/no_flux_constraints_02.mpirun=9.release (Failed)
5519 - mpi/no_flux_constraints_03.mpirun=1.debug (Failed)
5520 - mpi/no_flux_constraints_03.mpirun=1.release (Failed)
5521 - mpi/no_flux_constraints_03.mpirun=9.debug (Failed)
5522 - mpi/no_flux_constraints_03.mpirun=9.release (Failed)
5757 - mpi/periodicity_01.mpirun=3.debug (Failed)
5758 - mpi/periodicity_01.mpirun=3.release (Failed)
5759 - mpi/periodicity_01.mpirun=5.debug (Failed)
5760 - mpi/periodicity_01.mpirun=5.release (Failed)
5761 - mpi/periodicity_01.mpirun=7.debug (Failed)
5762 - mpi/periodicity_01.mpirun=7.release (Failed)
5909 - mpi/step-40.mpirun=10.debug (Failed)
5910 - mpi/step-40.mpirun=10.release (Failed)
5911 - mpi/step-40.mpirun=3.debug (Failed)
5912 - mpi/step-40.mpirun=3.release (Failed)
5913 - mpi/step-40.mpirun=4.debug (Failed)
5914 - mpi/step-40.mpirun=4.release (Failed)
5915 - mpi/step-40_direct_solver.mpirun=10.debug (Failed)
5916 - mpi/step-40_direct_solver.mpirun=10.release (Failed)
5917 - mpi/step-40_direct_solver.mpirun=3.debug (Failed)
5918 - mpi/step-40_direct_solver.mpirun=3.release (Failed)
5919 - mpi/step-40_direct_solver.mpirun=4.debug (Failed)
5920 - mpi/step-40_direct_solver.mpirun=4.release (Failed)

I will have a look at whether those are related as well...
and then will have a look at those n_locally_owned_active_cells(), etc.

Regards,
Denis.

Denis Davydov

Apr 27, 2014, 12:46:41 PM
to dea...@googlegroups.com
update. It seems those tests are not related to the branch:


On Sunday, April 27, 2014 10:26:10 AM UTC+2, Denis Davydov wrote:
4226 - gla/vec_05.mpirun=4.debug (Failed)
4227 - gla/vec_05.mpirun=4.release (Failed)
those two do not seem to be related to my modifications. They do not use Triangulations at all.
 
5295 - mpi/codim_01.mpirun=3.debug (Failed)
Here the failing diff is something like:

< <VTKFile type="PUnstructuredGrid" version="0.1" byte_order="LittleEndian">
---
> <VTKFile type="PUnstructuredGrid" version="0.1" compressor="vtkZLibDataCompressor" byte_order="LittleEndian">
7c7
<       <PDataArray type="Float32" Name="Position" NumberOfComponents="3" format="ascii"/>
---
>       <PDataArray type="Float32" Name="Position" NumberOfComponents="3" format="binary"/>
 
etc. For whatever reason there is an output mismatch (ascii/binary).
 
5505 - mpi/multigrid_adaptive.mpirun=1.debug (Failed)
5506 - mpi/multigrid_adaptive.mpirun=1.release (Failed)

Here it is an accuracy issue:

67c67
< DEAL:cg::Convergence step 5 value 1.65910e-17
---
> DEAL:cg::Convergence step 5 value 1.65911e-17

 
5915 - mpi/step-40_direct_solver.mpirun=10.debug (Failed)
5916 - mpi/step-40_direct_solver.mpirun=10.release (Failed)
5917 - mpi/step-40_direct_solver.mpirun=3.debug (Failed)
5918 - mpi/step-40_direct_solver.mpirun=3.release (Failed)
5919 - mpi/step-40_direct_solver.mpirun=4.debug (Failed)
5920 - mpi/step-40_direct_solver.mpirun=4.release (Failed)
Those fail because I don't have PETSc compiled with MUMPS.

Could you please check if the current state of the branch passes all tests on your machine?

Best regards,
Denis.

Denis Davydov

Apr 27, 2014, 1:05:05 PM4/27/14
to dea...@googlegroups.com


> Should we add things like:
> n_locally_owned_active_cells(),  n_global_active_cells(),
> n_locally_owned_active_cells_per_processor()
> or not? This would simplify step-18 so we don't need
> GridTools::count_cells_with_subdomain_association().

Yes, definitely.
W.

I looked at this issue.
What is needed is, pretty much, a copy-paste of the distributed Triangulation<dim,spacedim>::update_number_cache().

I don't think copy-paste is nice, but so far I don't see how to reuse the same chunk of code for the shared triangulation.

Best regards,
Denis. 

 

Timo Heister

Apr 27, 2014, 6:46:40 PM4/27/14
to dea...@googlegroups.com
> Could you please check if the current state of the branch passes all tests on your machine?

I just set up the job to run and it will appear later today on the cdash.

> i looked at this issue.
> What needed is, pretty much, a copy-paste of distributed
> Triangulation<dim,spacedim>::update_number_cache ().
>
> I don't think copy-paste is nice, but so far i don't see how to use the same
> chunk of code for shared triangulation.
> Do you have any ideas?

You could introduce a new base class of parallel::Tria and
shared::Tria that is derived from Triangulation and implements the
interface. Note that you need to fill the number cache differently in
shared and parallel though.

Timo Heister

Apr 28, 2014, 7:42:47 AM4/28/14
to dea...@googlegroups.com
>> Could you please check if the current state of the branch passes all tests on your machine?
>
> I just set up the job to run and it will appear later today on the cdash.

Looks like everything is good now:
http://cdash.kyomu.43-1.org/buildSummary.php?buildid=5639

Timo Heister

Apr 29, 2014, 6:44:37 PM4/29/14
to dea...@googlegroups.com
Hey Denis,

some more things to do for you, sorry. :-)

./ We need a constructor of the form
   Triangulation (MPI_Comm mpi_communicator,
                  const typename dealii::Triangulation<dim,spacedim>::MeshSmoothing)

./ Something is wrong in the number cache. In test sharedtria/dof_01 this fails:
Assert(dof_handler.n_locally_owned_dofs() ==
dof_handler.n_locally_owned_dofs_per_processor()[triangulation.locally_owned_subdomain()],
ExcInternalError());

./ The number of DoFs changes with a different number of CPUs in the
test sharedtria/dof_01, which shouldn't be the case.

Timo Heister

Apr 29, 2014, 7:02:20 PM4/29/14
to dea...@googlegroups.com
Also, tests/sharedtria/dof_02 shows that renumbering is broken with #CPUs>1:

4563: An error occurred in line <536> of file
</ssd/branch_sharedtria/deal.II/source/dofs/dof_renumbering.cc> in
function
4563: void dealii::DoFRenumbering::component_wise(DoFHandler<dim,
spacedim> &, const std::vector<unsigned int> &) [dim = 2, spacedim =
2]
4563: The violated condition was:
4563: (result == dof_handler.n_locally_owned_dofs()) ||
((dof_handler.n_locally_owned_dofs() < dof_handler.n_dofs()) &&
(result <= dof_handler.n_dofs()))
4563: The name and call sequence of the exception was:
4563: ExcRenumberingIncomplete()
4563: Additional Information:
4563: (none)

Denis Davydov

May 2, 2014, 3:04:59 AM5/2/14
to dea...@googlegroups.com
Hi Timo,

I will look at those issues soon...


On Wednesday, April 30, 2014 12:44:37 AM UTC+2, Timo Heister wrote:
./ Something is wrong in the number cache. In test sharedtria/dof_01 this fails:
      Assert(dof_handler.n_locally_owned_dofs() ==
dof_handler.n_locally_owned_dofs_per_processor()[triangulation.locally_owned_subdomain()],
ExcInternalError());


It's not broken; n_locally_owned_dofs_per_processor is just not implemented yet :)

Regards,
Denis.
 

Denis Davydov

May 2, 2014, 8:14:41 AM5/2/14
to dea...@googlegroups.com
Hi Timo,

I think the dofs_01 reference output should be updated to the current output of the tests.
Please have a look.

I also updated the constructor.
locally_owned_dofs_per_processor and their counts are also in the number_cache now.

tria_01 passes.

The only problem I have is with dofs_02, where you call DoFRenumbering::component_wise(dof_handler) for 3 processors.
I don't get the error you posted now, but a segfault...
Since there are some problems with gdb on OS X, I would appreciate it if you could also have a
look at this test; perhaps you will notice what could be wrong when renumbering is involved.
Note that DoFRenumbering::subdomain_wise (dof_handler) is called within distribute_dofs of the corresponding policy class.
Could there be problems with those renumberings being called one after another?

Regards,
Denis. 

Timo Heister

May 2, 2014, 11:05:36 AM5/2/14
to dea...@googlegroups.com
This is what happens in dofs_02 (I don't have time to work on this
more over the next couple of days, but this might help):

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff3aceba8 in
dealii::DoFTools::get_subdomain_association<dealii::DoFHandler<2, 2> >
(dof_handler=...,
subdomain_association=std::vector of length 818, capacity 818 = {...})
at /ssd/branch_sharedtria/deal.II/source/dofs/dof_tools.cc:1172
1172 if (subdomain_association[local_dof_indices[i]] ==
(gdb) bt
#0 0x00007ffff3aceba8 in
dealii::DoFTools::get_subdomain_association<dealii::DoFHandler<2, 2> >
(dof_handler=...,
subdomain_association=std::vector of length 818, capacity 818 = {...})
at /ssd/branch_sharedtria/deal.II/source/dofs/dof_tools.cc:1172
#1 0x00007ffff3acf97c in
dealii::DoFTools::locally_owned_dofs_with_subdomain<dealii::DoFHandler<2,
2> > (
dof_handler=...) at
/ssd/branch_sharedtria/deal.II/source/dofs/dof_tools.cc:1084
#2 0x00007ffff39a2c0e in
dealii::internal::DoFHandler::Policy::ParallelShared<2,
2>::renumber_dofs (this=0x725050,
new_numbers=std::vector of length 232, capacity 232 = {...},
dof_handler=..., number_cache_current=...)
at /ssd/branch_sharedtria/deal.II/include/deal.II/dofs/dof_handler_policy.h:1011
#3 0x00007ffff394ac26 in dealii::DoFHandler<2, 2>::renumber_dofs
(this=0x7fffffffb6a0,
new_numbers=std::vector of length 232, capacity 232 = {...})
at /ssd/branch_sharedtria/deal.II/include/deal.II/dofs/dof_handler.h:1289
#4 0x00007ffff39e1c33 in dealii::DoFRenumbering::component_wise<2, 2>
(dof_handler=...,
component_order_arg=std::vector of length 0, capacity 0)
at /ssd/branch_sharedtria/deal.II/source/dofs/dof_renumbering.cc:538
#5 0x0000000000423ca9 in test<2> () at
/ssd/branch_sharedtria/tests/sharedtria/dof_02.cc:95
#6 0x0000000000422d02 in main (argc=1, argv=0x7fffffffc948) at
/ssd/branch_sharedtria/tests/sharedtria/dof_02.cc:147
(gdb) p i
$1 = 22
(gdb) p local_dof_indices
$2 = std::vector of length 36, capacity 36 = {0, 528, 346, 347, 0, 0,
33, 289, 290, 293, 295, 297, 298, 301, 302, 291,
305, 306, 309, 310, 0, 311, 7596592, 312, 293, 295, 321, 323, 325,
326, 329, 330, 305, 306, 333, 18337}
(gdb) p subdomain_association
$3 = std::vector of length 818, capacity 818 = {2, 4294967295, 4294967295,
4294967295, ...}
(almost every one of the 818 entries is 4294967295, i.e. -1 as an unsigned
int; only a few entries equal 2)
(gdb) p local_dof_indices[i]
$4 = (unsigned int &) @0x7436b8: 7596592
> --
> The deal.II project is located at http://www.dealii.org/
> For mailing list/forum options, see
> https://groups.google.com/d/forum/dealii?hl=en
> ---
> You received this message because you are subscribed to the Google Groups
> "deal.II User Group" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to dealii+un...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.



Denis Davydov

May 2, 2014, 3:00:40 PM5/2/14
to dea...@googlegroups.com
Hi Timo,

Thanks for posting this.
It is clear that local_dof_indices[i] for a particular cell is larger than the number of DoFs (7596592 > 818),
hence the SIGSEGV at subdomain_association[local_dof_indices[i]].

Why this happens is another question.
I will see if I can trigger this outside of the shared triangulation...

Regards,
Denis.

Denis Davydov

May 4, 2014, 6:36:08 AM5/4/14
to dea...@googlegroups.com
I think I know what causes the problem.

For the subdomain-wise renumbering, the renumbering vector is initialised with n_dofs():

  template <class DH>
  void
  subdomain_wise (DH &dof_handler)
  {
    std::vector<types::global_dof_index> renumbering (dof_handler.n_dofs(),
                                                      DH::invalid_dof_index);
    compute_subdomain_wise (renumbering, dof_handler);
    dof_handler.renumber_dofs (renumbering);
  }



Whereas for component-wise renumbering it is n_locally_owned_dofs():

  template <int dim, int spacedim>
  void
  component_wise (DoFHandler<dim,spacedim>        &dof_handler,
                  const std::vector<unsigned int> &component_order_arg)
  {
    std::vector<types::global_dof_index> renumbering (dof_handler.n_locally_owned_dofs(),
                                                      DoFHandler<dim>::invalid_dof_index);
    ...
    dof_handler.renumber_dofs (renumbering);
  }


Of course, for a usual triangulation this does not matter, as the NumberCache is kept independent of the MPI processes and
n_dofs() == n_locally_owned_dofs().
For the new class, however, the two are different!

Without going into many lines of code, I assume that for Sequential<dim,spacedim>::renumber_dofs the renumbering vector should be of size n_dofs(), right?
Are there any cases in the library when it is called with renumbering.size() < n_dofs()?
Hopefully, what is written for the component_wise renumbering also makes sense when n_locally_owned_dofs() < n_dofs()?
I suppose that's exactly what is called for the distributed triangulation, right?

So it seems to me that in order to safely use the Sequential renumbering within the new class,
those "renumbering" vectors should be gathered among the processors.
Can I gather them as-is, or do I need to be more careful about it?


Regards,

Denis.


Timo Heister

May 4, 2014, 10:29:10 AM5/4/14
to dea...@googlegroups.com
> Without going into many lines of code, I assume for
> Sequential<dim,spacedim>::renumber_dofs, renumbering should be of n_dofs(),
> right?

Yes.

> Are there any cases in the library when it is called with
> renumbering.size() < n_dofs()?

It is n_locally_owned_dofs() for a parallel::distributed mesh.

> So it seems to me that in order to safely use Sequential renumbering within
> the new class,
>
> those "renumbering" should be gathered among processors.
>
> Can i gather them as-is, or i need to be more careful about it?

Because all the unknowns are known on each processor, you should
handle it exactly as in the serial case (use n_dofs()). This way no
communication is needed.

Denis Davydov

May 4, 2014, 10:59:15 AM5/4/14
to dea...@googlegroups.com
On Sunday, May 4, 2014 4:29:10 PM UTC+2, Timo Heister wrote:

> So it seems to me that in order to safely use Sequential renumbering within
> the new class,
>
> those "renumbering" should be gathered among processors.
>
> Can i gather them as-is, or i need to be more careful about it?

Because all the unknowns are known on each processor, you should
handle it exactly as in the serial case (use n_dofs()). This way no
communication is needed.

OK, then I will add a number of dynamic_cast checks in dof_renumbering.cc to decide whether
n_dofs() (shared) or n_locally_owned_dofs() (distributed) should be used...

Regards,
Denis.
 

Denis Davydov

May 4, 2014, 2:49:57 PM5/4/14
to dea...@googlegroups.com


On Sunday, May 4, 2014 4:29:10 PM UTC+2, Timo Heister wrote:

Because all the unknowns are known on each processor, you should
handle it exactly as in the serial case (use n_dofs()). This way no
communication is needed.


It seems it is not so trivial.
For example, in component_wise renumbering there is:

        if (is_level_operation)
          {
            // we are dealing with mg dofs, skip foreign level cells:
            if ((start->get_dof_handler().get_tria().locally_owned_subdomain() != numbers::invalid_subdomain_id)
                &&
                (cell->level_subdomain_id() != start->get_dof_handler().get_tria().locally_owned_subdomain()))
              continue;
          }
        else
          {
            // we are dealing with active dofs, skip the loop if not locally owned:
            if (!cell->active() || !cell->is_locally_owned())
              continue;
          }


All those is_locally_owned() calls will return values consistent with the distributed case.
Perhaps other algorithms are also strongly built on such assumptions.

Now it seems to me that it is easier to do some communication inside the renumbering policy class, à la the distributed case.
Simple dynamic casts choosing between n_dofs() and n_locally_owned_dofs() don't help and lead to other errors even within the component_wise renumbering, not to mention the others, which I did not try.

What do you think?

Regards,
Denis.

Timo Heister

May 6, 2014, 12:37:08 AM5/6/14
to dea...@googlegroups.com
> All those is_locally_owned will return values consistent with distributed
> case.
> Perhaps, other algorithms are also strongly build on such assumptions.

Likely, yes. Sorry, I wasn't thinking about this.

> Now it seems to me that it is easier to do some communication inside
> renumbering policy class a-la distributed case.

Not sure. Because every process knows all DoFs, you would need to
transfer every new DoF to every other process.

Denis Davydov

May 7, 2014, 12:58:16 PM5/7/14
to dea...@googlegroups.com
On Tuesday, May 6, 2014 6:37:08 AM UTC+2, Timo Heister wrote:
> Now it seems to me that it is easier to do some communication inside
> renumbering policy class a-la distributed case.

Not sure. Because every process knows all DoFs, you would need to
transfer every new DoF to every other process.


Every process knows all DoFs, but the new numbers are sent only for locally_owned_dofs,
and those numbers are not global, as far as I understand.

Can I rebuild the association current_dofs <- new numbers based on:
a) a loop over all active cells;
b) checking if the cell is locally owned;
c) getting the local DoFs and checking which of them belong to the current IndexSet locally_owned_dofs()?

If on each processor I could relate the new numbers to the particular DoFs they belong to,
I could rather easily gather those numbers into a single vector of size n_dofs() and proceed with the
usual renumbering...
 

Timo Heister

May 7, 2014, 9:10:53 PM5/7/14
to dea...@googlegroups.com
I guess you would then need to:
./ create a vector<dofindex> of length locally_owned.n_elements() and
copy the entries over
./ send this vector to each other process
./ create a vector<dofindex> of size n_dofs
./ receive n_proc vectors and copy the indices into the vector above
using the locally_owned_per_cpu IndexSets
./ hand the vector to sequential::renumbering



Denis Davydov

May 15, 2014, 10:57:38 AM5/15/14
to dea...@googlegroups.com
Thanks, Timo.

I implemented this (last commit on the branch).
Could you please have a quick look to see if that's what you meant?

Yet, it seems something is wrong... :(
I attach a simple test with 4 elements in 2d, where one can examine the numbers directly.

Regards,
Denis.
problem_all.cc

Denis Davydov

May 15, 2014, 11:33:55 AM5/15/14
to dea...@googlegroups.com
A small update:

I printed out the input "new_numbers" for component_wise renumbering (two processors),
and somehow they don't look as if they will give proper numbers (2 elements per processor, (FE_Q<dim>(1), 1, FE_Q<dim>(1), 1)):
   process 0:
   0 3 4 1 5 6 2 7 8
   process 1:
   0 1 2 6 3 7 4 5 8

In turn, the DoFs on the 1st element change from
0 1 9 2 3 4 10 5
to
0 3 9 4 1 5 10 6

Either I misunderstand what that input vector with new_numbers means,
or something is wrong during its preparation...

Attached a slightly modified file with proper parallel output.

Regards,
Denis.
problem_all.cc

Denis Davydov

May 15, 2014, 1:07:33 PM5/15/14
to dea...@googlegroups.com
I think I know what the problem is.
In the DoF renumbering, there are numerous checks like:

  if (const parallel::distributed::Triangulation<dim,spacedim> *tria
        = (dynamic_cast<const parallel::distributed::Triangulation<dim,spacedim>*>
           (&start->get_dof_handler().get_tria())))

so dof_renumbering.cc as-is will not work at all, irrespective of what is inside the Policy class.

Everything inside those ifs, as far as I can see, applies to parallel::shared::Triangulation as well.

How about yet another class from which both distributed and shared are derived, say parallel::Triangulation<>?
It can be abstract and only provide the common methods which both shared and distributed implement,
and then be used only for casting in such situations.

Any better solution?

Denis Davydov

May 16, 2014, 3:23:15 PM5/16/14
to dea...@googlegroups.com
I went ahead and implemented the idea I described in the last message.

Component-wise renumbering works now for the simple example above;
I have not looked at the tests yet.

Timo, could you please run the full test suite to see if parallel::distributed::Triangulation still works,
and also have a quick look at my modifications?
If everything is all right, I will have a look at the other renumberings in dofs/dof_renumbering.cc
to change the old dynamic casts to the new ones.

Regards,
Denis.

Timo Heister

May 16, 2014, 3:59:37 PM5/16/14
to dea...@googlegroups.com
Hey Denis,

> How about yet another class from which both distributed and shared are derived, say parallel::Triangulation<> ?

Good idea.

> Timo, could you please run the full test suite to see if
> parallel::distributed::Triangulation still works,

It is running now.

> and also have a quick look of my modifications.

Will do (but it might take a little longer; I am traveling again). Is
any of the renumbering working already?

Denis Davydov

May 16, 2014, 6:21:16 PM5/16/14
to dea...@googlegroups.com
Hi Timo,


On Friday, May 16, 2014 9:59:37 PM UTC+2, Timo Heister wrote:
> and also have a quick look of my modifications.

will do (but it might take a little longer, I am traveling again). Is
any of the renumbering working already?

Component-wise works on my test example of 4 cells and fe_system(FE_Q<dim>(1), 1, FE_Q<dim>(1), 1)
(the file is attached above).

Will be waiting for the test suite results...

Regards,
Denis
 
 

Timo Heister

May 17, 2014, 9:02:16 PM5/17/14
to dea...@googlegroups.com
> Will be waiting for the test suite results...

Some of the shared tria stuff is not compiling with clang. I will take
a look at it:

http://cdash.kyomu.43-1.org/index.php?project=deal.II&subproject=branch_sharedtria

Timo Heister

May 19, 2014, 4:38:14 PM5/19/14
to dea...@googlegroups.com

Denis Davydov

May 20, 2014, 4:38:42 AM5/20/14
to dea...@googlegroups.com

> Will be waiting for the test suite results...

some of the shared tria stuff is not compiling with clang. I will take
a look at it:

http://cdash.kyomu.43-1.org/index.php?project=deal.II&subproject=branch_sharedtria

Damn it... :)

I will have a look and see if I can trigger it on smaller meshes/test cases...
Funny enough, it did work on the 4-element mesh, which makes things trickier to debug...

Denis Davydov

Jul 11, 2014, 4:15:31 AM7/11/14
to dea...@googlegroups.com


On Tuesday, May 20, 2014 10:38:42 AM UTC+2, Denis Davydov wrote:
i will have a look and see if i can trigger it on smaller meshes/test cases....
funny enough it did work on 4 element mesh, which makes things more tricky to debug...

I am able to trigger this problem on a small mesh (1 global refinement + 1 locally refined cell).
I am looking at it now to check whether my Asserts are too conservative or something is indeed wrong...

Denis Davydov

Jul 11, 2014, 6:00:52 AM7/11/14
to dea...@googlegroups.com
I found the problem but I do not understand what is wrong.

In renumber_dofs() of the DoF handler policy I MPI_Allgather all the new numbers provided to the function (lines 1049-1054 of dof_handler_policy.cc@33136):

  std::vector<types::global_dof_index> new_numbers_copy (new_numbers);
  MPI_Barrier (tr->get_communicator ());
  MPI_Allgather (&new_numbers_copy[0],     new_numbers_copy.size (),
                 DEAL_II_DOF_INDEX_MPI_TYPE,
                 &gathered_new_numbers[0], new_numbers_copy.size (),
                 DEAL_II_DOF_INDEX_MPI_TYPE,
                 tr->get_communicator ());


However, not all processors end up with the same gathered vector:

process 0:
0 14 1 15 2 16 3 17 4 5 18 6 7 8 19 9
process 1:
10 20 11 21 22 12 23 24 13 25 26 27
gathered at processor 0 :
0 14 1 15 2 16 3 17 4 5 18 6 7 8 19 9 10 20 11 21 22 12 23 24 13 25 26 27
gathered at processor 1 :
0 14 1 15 2 16 3 17 4 5 18 6 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


Any ideas what could be wrong? I am puzzled...

P.S. The test which produces these results is attached.

Regards,
Denis.

 
problem_all_hanging_nodes.cc

Timo Heister

Jul 11, 2014, 1:00:21 PM7/11/14
to dea...@googlegroups.com
MPI_Allgather can only be used to transfer the same number of elements
from each rank, which is not the case here, right? You should use
MPI_Allgatherv instead.



Denis Davydov

Jul 11, 2014, 1:13:09 PM7/11/14
to dea...@googlegroups.com
Thanks a lot, Timo.
You are absolutely right, the number of elements (locally owned DoFs) is different!
My bad, I will fix it...

Denis Davydov

Aug 4, 2014, 10:40:40 AM8/4/14
to dea...@googlegroups.com
In the current state, the results of sharedtria/dof_01.mpirun=3.debug and sharedtria/dof_01.mpirun=3.release
have changed.
I am not sure whether the current numbers of locally owned DoFs are correct or the previous ones,
but the sum of DoFs is the same. Any idea how to find out?

For the second test, the failing diff is something like:

< DEAL:2:3d::n_locally_owned_dofs: 41826
< DEAL:2:3d::n_locally_owned_dofs_per_processor: 13741 14200 13885  sum: 41826
---
> DEAL:2:3d::n_locally_owned_dofs: 4054582394
> DEAL:2:3d::n_locally_owned_dofs_per_processor: 14020  sum: 14020


so there is clearly a problem. In debug mode some Asserts fail:

dof_handler.n_locally_owned_dofs() == dof_handler.locally_owned_dofs().n_elements()

and

std::accumulate (dof_handler.n_locally_owned_dofs_per_processor().begin(), dof_handler.n_locally_owned_dofs_per_processor().end(), 0U) == N

so something is definitely messed up... I will try to find out what...

Denis Davydov

Aug 4, 2014, 10:51:36 AM8/4/14
to dea...@googlegroups.com
I wonder, is it generally safe to assume that number_cache.n_locally_owned_dofs_per_processor[]
does not change on renumbering?
So far I assume that it is true and use it inside dof_handler_policy.cc. I also have several Asserts related to this and none of them
seems to be violated, so maybe the problem is elsewhere...

Regards,
Denis.

Wolfgang Bangerth

Aug 4, 2014, 10:56:33 AM8/4/14
to dea...@googlegroups.com
My gut feeling was initially to say that this isn't safe to assume, but on
second thought I can't come up with a case where the number would change. A
DoF is located on some object in the triangulation (cell, edge, face) and
whether a processor owns it or not is solely determined by who owns the cell
and, if it's on an interface between cells, who owns the two adjacent cells.
The tie breaker doesn't look at the index of the DoF as far as I know, just
the owners of the two cells, so it should be immune to renumbering...

Having such assertions is probably a good idea nonetheless.

Best
W.


--
------------------------------------------------------------------------
Wolfgang Bangerth email: bang...@math.tamu.edu
www: http://www.math.tamu.edu/~bangerth/

Denis Davydov

Aug 5, 2014, 2:18:07 AM8/5/14
to dea...@googlegroups.com
Found a bug. Both the dof_01 and dof_02 tests now fail on the diff.
Hopefully the current number of DoFs is the correct one...

I will commit as soon as I switch to Git...

Denis Davydov

Aug 11, 2014, 8:06:19 AM8/11/14
to dea...@googlegroups.com
I tried running the tests on the Git repo, but I have some problems:

   make setup_tests (in the build folder)

does not pick up the "sharedtria" category.

Therefore, when I try

   ctest -j4 -R 'sharedtria'

it complains that "No tests were found!!!"

There was a problem with CMakeLists.txt (a wrong path to setup_testsubproject.cmake),
but I fixed it (on my GitHub fork). Yet, that does not help with the issue.
Is there any other place where one needs to "add" a category in the test suite?
Or maybe something has changed as compared to the SVN version?

Regards,
Denis.

Timo Heister

Aug 28, 2014, 4:06:31 PM8/28/14
to dea...@googlegroups.com
Denis,

In case you haven't figured it out by now: you have to add
"sharedtria" to tests/CMakeLists.txt under "categories".

How far along are you otherwise?

Denis Davydov

Aug 28, 2014, 4:56:26 PM8/28/14
to dea...@googlegroups.com
Thanks, Timo.
That helped.

I just ran all the sharedtria tests on the Git repo; they all failed.
I will have a look at what's wrong there; maybe some sources for the diff are missing...

Denis Davydov

Sep 9, 2014, 9:00:46 AM9/9/14
to dea...@googlegroups.com

On Thursday, August 28, 2014 10:56:26 PM UTC+2, Denis Davydov wrote:
I just run all sharedtria tests on the git repo, they all failed. 
I will have a look what's wrong there, maybe some sources for diff are missing... 

I had some paths missing.
I just reran the tests: 01 and 02 failed in the DIFF stage for mpirun=3 (a slight redistribution of DoFs among processors).
I hope the current answer is the correct one.

To proceed, I will clean up the modifications.
Maybe you, Timo, could have a look at it in comparison with the "vanilla" deal.II
to check if the structure/approach looks fine.
If so, the other renumbering routines could be modified in the same (I already forgot which :D) manner.
How does that sound?

Regards,
Denis.

Craig Michoski

Oct 23, 2014, 5:21:45 PM10/23/14
to dea...@googlegroups.com
Hi all,

Is there any progress on this issue?  I just saw this thread, but this issue is, I think, related to one I posted a few months back which was marked as resolved (I guess only the get_name error was resolved):

https://groups.google.com/forum/#!searchin/dealii/fetools$20interpolate/dealii/3lq3JDvhBh8/WBYifduSub4J

I put together a test of my own that I have been trying to test this with (attached), which is somewhat different from Denis' problem, but they seem pretty closely linked.  I am on pretty much the current trunk, I think.

Craig
interpolate_bug.cc

Denis Davydov

unread,
Oct 24, 2014, 1:56:22 AM10/24/14
to dea...@googlegroups.com
Hi Craig,

> On 23 Oct 2014, at 23:21, Craig Michoski <mich...@gmail.com> wrote:
>
> Hi all,
>
> Is there any progress on this issue? I just saw this thread, but this issue is I think related to one I posted a few months back which was marked as resolved (I guess only the get_name error was resolved):
>
> https://groups.google.com/forum/#!searchin/dealii/fetools$20interpolate/dealii/3lq3JDvhBh8/WBYifduSub4J
>

I think your issue is different.
What I have been fighting here is
making interpolate work in parallel
with a shared triangulation.

That resulted in a new Triangulation class which internally does the renumbering and partitioning of the shared triangulation among the processes.
So it's kind of a shared version of the distributed triangulation class.
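For readers finding this thread later: this line of work is what eventually landed in deal.II as parallel::shared::Triangulation. Below is a rough, uncompiled sketch of the intended usage pattern, where class and header names are taken from later deal.II releases and should be treated as assumptions relative to the 2014 branch discussed here:

```cpp
// Uncompiled sketch: interpolating between two FE spaces on a "shared"
// triangulation (every process stores the whole mesh, but cells are
// partitioned among MPI processes via subdomain ids). Names follow later
// deal.II releases and are assumptions relative to this thread's branch.
#include <deal.II/distributed/shared_tria.h>
#include <deal.II/dofs/dof_handler.h>
#include <deal.II/dofs/dof_tools.h>
#include <deal.II/fe/fe_q.h>
#include <deal.II/fe/fe_tools.h>
#include <deal.II/grid/grid_generator.h>
#include <deal.II/lac/constraint_matrix.h>
#include <deal.II/lac/trilinos_vector.h>

using namespace dealii;

template <int dim>
void interpolate_q1_to_q2(MPI_Comm mpi_communicator)
{
  parallel::shared::Triangulation<dim> tria(mpi_communicator);
  GridGenerator::hyper_cube(tria);
  tria.refine_global(3);

  FE_Q<dim>       fe_1(1), fe_2(2);
  DoFHandler<dim> dof_handler_1(tria), dof_handler_2(tria);
  dof_handler_1.distribute_dofs(fe_1);
  dof_handler_2.distribute_dofs(fe_2);

  ConstraintMatrix constraints_2;
  DoFTools::make_hanging_node_constraints(dof_handler_2, constraints_2);
  constraints_2.close();

  // The source vector must expose all locally relevant DoFs (for a shared
  // triangulation that is all of them); the destination is distributed
  // over the locally owned DoFs.
  IndexSet relevant_1;
  DoFTools::extract_locally_relevant_dofs(dof_handler_1, relevant_1);
  TrilinosWrappers::MPI::Vector u_1(relevant_1, mpi_communicator);
  TrilinosWrappers::MPI::Vector u_2(dof_handler_2.locally_owned_dofs(),
                                    mpi_communicator);

  FETools::interpolate(dof_handler_1, u_1,
                       dof_handler_2, constraints_2,
                       u_2);
}
```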

The latest issues were with some renumbering functions.

Perhaps I should merge in master and create a pull request to discuss (not merge) whether the way some problems are currently handled on the branch is OK.

Regards,
Denis

Craig Michoski

Oct 24, 2014, 12:08:49 PM
to dea...@googlegroups.com
Hi Denis,

Okay.  Yeah, I guess they are separate issues, but they may share dependencies in the sense that changes for one could affect the other.  So it might be easier to merge if we coordinate some, or at least get you merged back to master before I try getting this other thing worked out.  But I guess we should wait to hear from Timo or Wolfgang...

I'm going to start a different thread for my issue then, just for clarity. 

Cheers,
Craig

Timo Heister

Oct 24, 2014, 1:48:37 PM
to dea...@googlegroups.com
Sorry, I didn't have time to look into this project recently.

What is currently working/not working?

> Perhaps I should merge in master and create a pull request to discuss (not merge) whether the way some problems are currently handled on the branch is OK.

Yes, that sounds like a good idea. I will promise to look at it then.

Denis Davydov

Oct 24, 2014, 2:08:07 PM
to dea...@googlegroups.com

> On 24 Oct 2014, at 19:48, Timo Heister <hei...@clemson.edu> wrote:
>
> Sorry, I didn't have time to look into this project recently.
>
> What is currently working/not working?
>
>> Perhaps I should merge in master and create a pull request to discuss (not merge) whether the way some problems are currently handled on the branch is OK.
>
> Yes, that sounds like a good idea. I will promise to look at it then.

Thanks, Timo. Will do that soon...

Best,
Denis

Denis Davydov

Oct 27, 2014, 8:33:31 AM
to dea...@googlegroups.com
The pull request is created: https://github.com/dealii/dealii/pull/212
I propose shifting further discussion to GitHub, as this thread is getting super long
and kind of off-topic.

Regards,
Denis.