extract_dofs() extremely slow in for single-thread run?

develo...@googlemail.com

unread,

Aug 31, 2020, 4:58:05 AM8/31/20

to deal.II User Group

I'd like to simulate four coupled PDEs (with variables A, B, C and D) using deal.II. At each step in time I'm interested in the maximum and minimum value of each variable. Therefore, I have to loop over my solution, and gather the maximum and minimum value of each variable by using

vector_t local_solution;

present_solution.update_ghost_values();

local_solution = present_solution;

IndexSet locally_owned_elements = local_solution.locally_owned_elements();

locally_owned_elements.compress();

for(auto index: present_solution.locally_owned_elements()) {

if(A_value_mask.is_element(locally_owned_elements.index_within_set(index))) {

max_A_value = std::max(max_A_value, static_cast<double>(local_solution(index)));

min_A_value = std::min(min_A_value, static_cast<double>(local_solution(index)));

}

To create A_value_mask, I use

const FEValuesExtractors::Scalar surface_A(0);

const ComponentMask surface_A_mask = fe.component_mask(surface_A);

IndexSet A_value_mask = DoFTools::extract_dofs(dof_handler, surface_A_mask);

For 2.4 million elements and four variables (i.e. four calls), this second part (more specifically the call extract_dofs()) takes ~120 sec., which takes up ~30% of the total run time. I have to use such a construct three times in my code, resulting in 360 sec. spend in extracting dofs, and ~40 sec. in the other parts of the program. Is that by design, or is there something wrong in my code?

Thanks!

Jean-Paul Pelteret

unread,

Sep 1, 2020, 2:28:18 PM9/1/20

to dea...@googlegroups.com

Hi,

30-40% of the runtime for this operation seems a bit excessive to me. Maybe you should try using the FEValuesExtractors in conjunction with FEValues in order to get the solution at the quadrature points. If you want the solution at the support points, then you can just create a quadrature rule that has the quadrature points at the support points and initialise your FEValues object with it.

So this bit of pseudo-code illustrates roughly how it could be done:

// https://dealii.org/current/doxygen/deal.II/classQuadrature.html#ac002803e6ec722da5a0de102574110c9

// https://dealii.org/current/doxygen/deal.II/classFiniteElement.html#ab4f6e0c83686b918fbb92716ead92313

const Quadrature<dim> q_cell_support_points (fe.unit_support_points()); // Assumes that your FE has unit support points; could also use the unit cell vertex positions, something else...

const FEValues<dim> fe_values(fe, q_cell_support_points, update_values);

const FEValuesExtractors::Scalar surface_A(0);

Vector<double> solution_vector = …; // If this is an MPI vector, then still use the same basic approach

double max_value = std::numeric_limits<double>::min();

for (auto cell : dof_handler.active_cell_iterators())

{

if (!cell->locally_owned()) continue;

fe_values.reinit(cell);

// https://dealii.org/current/doxygen/deal.II/classFEValuesViews_1_1Scalar.html#a7aa6b0275facea985f23a64ad690d9b1

std::vector<double> q_point_solution_values (fe_values.n_quadrature_points);

fe_values[surface_A].get_function_values(solution_vector , q_point_solution_values);

// This will get you the maximum of the solution at the queried points

// for the local process.

for (auto entry : q_point_solution_values)

max_value = std::max(max_value, entry);

// Since you’re doing the work with a cell loop, if appropriate then you could also

// retrieve the max for the other fields at the same time.

}

// If the solution is distributed, then you still need the max over all processes:

// https://dealii.org/current/doxygen/deal.II/namespaceUtilities_1_1MPI.html#ad2f716b789abe53715d6659f38aa7815

max_value = Utilities::MPI::max(max_value, mpi_communicator);

I think that’s more or less how you could do it using the established classes and methodology. Hopefully it would yield you better performance,

since you don’t have to fetch a huge index set mask. The other benefit is that what I’ve written here makes no assumptions about how the coefficients of the solution vector relate to the physical solution.

I hope that this helps you out. It would be great to hear whether or not this sort of approach yields a performance benefit.

Best,

Jean-Paul

--
The deal.II project is located at http://www.dealii.org/
For mailing list/forum options, see https://groups.google.com/d/forum/dealii?hl=en
---
You received this message because you are subscribed to the Google Groups "deal.II User Group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to dealii+un...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/dealii/80a29cd6-d2e8-447d-a0a9-89b2e0fc4b32n%40googlegroups.com.

develo...@googlemail.com

unread,

Sep 2, 2020, 8:11:07 AM9/2/20

to deal.II User Group

I'm using FESystem for defining fe, and therefore I get "unit_support_points is protected within this context". Is there another way to get access to those points?

Moreover, defining fe_values as const disables the reinit()-call, and I have to call cell->is_locally_owned(), instead of cell->locally_owned().

Thanks!

Jean-Paul Pelteret

unread,

Sep 2, 2020, 2:34:56 PM9/2/20

to dea...@googlegroups.com

I was close — it’s

fe.get_unit_support_points() // https://www.dealii.org/developer/doxygen/deal.II/classFiniteElement.html#a5b35a290aa7dd7562911a92a13b11fee

Moreover, defining fe_values as const disables the reinit()-call, and I have to call cell->is_locally_owned(), instead of cell->locally_owned().

Coding in an email client is unforgiving, so I’m glad that you were able to work past these errors.

Best,

Jean-Paul

To view this discussion on the web visit https://groups.google.com/d/msgid/dealii/4bad1bae-1f4d-4d82-9396-d5f5299b8aa4n%40googlegroups.com.

develo...@googlemail.com

unread,

Sep 3, 2020, 2:56:15 AM9/3/20

to deal.II User Group

Yes, that worked, thanks!

Now, when comparing, it takes 132 seconds for my initial version, and 0.41 seconds for the version you proposed. Why is my approach so much more expensive?

Thanks!

Maxi Miller

unread,

Sep 3, 2020, 3:16:50 AM9/3/20

to dea...@googlegroups.com

Another issue I have is for the following code piece, where I basically have to rescale every part of the solution by a different factor (due to using unitless variables in the calculation):

const ComponentMask surface_A_mask = fe.component_mask(surface_A);

const ComponentMask surface_B_mask = fe.component_mask(surface_B);
const ComponentMask surface_C_mask = fe.component_mask(surface_C);
const ComponentMask surface_D_mask = fe.component_mask(surface_D);

IndexSet A_value_mask = DoFTools::extract_dofs(dof_handler, surface_A_mask);

IndexSet B_value_mask = DoFTools::extract_dofs(dof_handler, surface_B_mask);
IndexSet C_value_mask = DoFTools::extract_dofs(dof_handler, surface_C_mask);
IndexSet D_value_mask = DoFTools::extract_dofs(dof_handler, surface_D_mask);

IndexSet locally_owned_elements = scaled_solution.locally_owned_elements();

A_value_mask.compress();
B_value_mask.compress();
C_value_mask.compress();
D_value_mask.compress();
locally_owned_elements.compress();

vector_t unified_test_vector(dof_handler.locally_owned_dofs(), mpi_communicator);

for(auto index : scaled_solution.locally_owned_elements()) {
     if(A_value_mask.is_element(locally_owned_elements.index_within_set(index)))
      {
          unified_test_vector(index) = unitless_recalculator.scale_A_solution(scaled_solution(index));
     }
     if(B_value_mask.is_element(locally_owned_elements.index_within_set(index)))
     {
          unified_test_vector(index) = unitless_recalculator.scale_B_solution(scaled_solution(index));
     }
     if(C_value_mask.is_element(locally_owned_elements.index_within_set(index)))
     {
          unified_test_vector(index) = unitless_recalculator.scale_C_solution(scaled_solution(index));
     }
     if(D_value_mask.is_element(locally_owned_elements.index_within_set(index)))
     {
          unified_test_vector(index) = unitless_recalculator.scale_D_solution(scaled_solution(index));
     }
}

Again, extract_dofs takes most of the time here. Is there an alternative approach for that, too?

You received this message because you are subscribed to a topic in the Google Groups "deal.II User Group" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/dealii/G95_BtRjONo/unsubscribe.
To unsubscribe from this group and all its topics, send an email to dealii+un...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/dealii/7a8c906b-67c0-4741-890f-d6699cf2225dn%40googlegroups.com.

Wolfgang Bangerth

unread,

Sep 3, 2020, 7:43:35 AM9/3/20

to dea...@googlegroups.com

On 9/3/20 1:16 AM, 'Maxi Miller' via deal.II User Group wrote:
>
> Again, extract_dofs takes most of the time here. Is there an alternative
> approach for that, too?

There is a variation of extract_dofs() that returns its results in the form of
a std::vector<bool> (as one of the arguments) instead of an IndexSet. Could
you try and see whether that is substantially faster?

Best
W.

--
------------------------------------------------------------------------
Wolfgang Bangerth email: bang...@colostate.edu
www: http://www.math.colostate.edu/~bangerth/

Martin Kronbichler

unread,

Sep 3, 2020, 7:55:40 AM9/3/20

to dea...@googlegroups.com

Indeed the other function was fast. I looked into the function already
and have a fix posted:

https://github.com/dealii/dealii/pull/10883

Best,
Martin

develo...@googlemail.com

unread,

Sep 3, 2020, 8:13:37 AM9/3/20

to deal.II User Group

I'm following the discussion, and will retest my program as soon as the commit has been approved and merged. Thanks!

develo...@googlemail.com

unread,

Sep 4, 2020, 4:27:41 AM9/4/20

to deal.II User Group

Using the merged commit my runtime decreased now from 120 sec. per call to ~1-2 sec. per call, thanks for the fast help!

Reply all

Reply to author

Forward