Load force is different with different numbers of processes

Qing Yin

Mar 28, 2018, 12:27:59 AM
to deal.II User Group
Dear community,

I have run into an MPI problem while simulating a three-point bending test. I am trying to compute
the load force (traction) on the top boundary. The results are correct with a single process,
but when I switch to multiple processes the results change.
For example, if we want the load force in the y direction, the results look like this:

* 1 core: -0.83
* 2 cores: -0.83
* 3 cores: -0.92
* 4 cores: -0.415
* 5 cores: -0.002

I already apply the `sum` collective operation after computing the locally owned
load force value. To double-check it, I also output each process's local value
(see the sketch right after the list below) and get results like this:

* 1 core: local load force y: -0.831775
* 2 cores: local load force y: (0) -0.415888  (1) -0.415888
* 4 cores: local load force y: (0) 0  (1) -0.408131  (2) -0.00237332  (3) -0.004908
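
The check itself is nothing more than printing the local value just before the
`sum` call, roughly like this (a minimal sketch, not the exact lines from my
code; the variable names are the ones from my postprocessing function):

// Print this rank's local contribution before the MPI reduction.
const unsigned int rank = Utilities::MPI::this_mpi_process(mpi_comm);
std::cout << "(" << rank << ") locally load force y: "
          << locally_owned_load_force[1] << std::endl;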

I do not know why I get smaller values with more cores.

I have also checked the boundary conditions and the mesh size. In fact, the displacement
results are correct and identical for different numbers of cores, so the problem should
lie only in this load force computation.

Has anyone been in the same boat before? Any suggestions would be greatly appreciated.

Thank you!

Best,
Qing

Wolfgang Bangerth

Mar 28, 2018, 2:47:58 PM
to dea...@googlegroups.com

Qing,
There is not enough information in your email to say what may be wrong.
Do I understand your description correctly that you have verified that
the *solution* is the same independent of the number of processors? If
so, then the problem must indeed be in the postprocessing step where you
compute the resulting forces. But without seeing the code that does this
computation, it's impossible to say what's wrong.

Best
W.

--
------------------------------------------------------------------------
Wolfgang Bangerth email: bang...@colostate.edu
www: http://www.math.colostate.edu/~bangerth/

Qing Yin

Mar 28, 2018, 5:48:25 PM
to dea...@googlegroups.com
Hi Prof. Bangerth,

I would like to post that snippet of code. To be honest, I verified the solution in a naive way: I compared the solutions for two different numbers of processes in ParaView. The comparison (including the data range and a plot of the value over time at a point) shows they are exactly the same.

template<int dim>
Tensor<1,dim> Postprocessor<dim>::
compute_load_force(const DoFHandler<dim> &dof_handler,
                   const LA::MPI::Vector &solution,
                   const double &lame_mu,
                   const double &lame_lambda,
                   const unsigned int &degree,
                   const unsigned int &boundary_id,
                   const MPI_Comm &mpi_comm) {

  const QGauss<dim-1> face_quadrature_formula (degree + 1);
  FEFaceValues<dim> fe_face_values (dof_handler.get_fe(),
                                    face_quadrature_formula,
                                    update_gradients | update_normal_vectors |
                                    update_JxW_values);

  const unsigned int dofs_per_cell = dof_handler.get_fe().dofs_per_cell;
  const unsigned int n_face_q_points = face_quadrature_formula.size();

  std::vector<unsigned int> local_dof_indices(dofs_per_cell);

  std::vector<std::vector<Tensor<1, dim>>> u_grads_per_cell(n_face_q_points,
                                                            std::vector<Tensor<1, dim>>(dim));

  // Declare load force per processor
  Tensor<1,dim> locally_owned_load_force;

  // Declare identity tensor
  SymmetricTensor<2, dim> identity = unit_symmetric_tensor<dim>();

  for (const auto &cell : dof_handler.active_cell_iterators())
    if (cell->is_locally_owned()) {
      for (unsigned int f = 0; f < GeometryInfo<dim>::faces_per_cell; ++f)
        // Find the top boundary based on the id
        if (cell->face(f)->at_boundary()
            && cell->face(f)->boundary_id() == boundary_id) {
          fe_face_values.reinit (cell, f);
          fe_face_values.get_function_gradients(solution, u_grads_per_cell);

          for (unsigned int q = 0; q < n_face_q_points; ++q) {

            // Compute strain and stress
            Tensor<2, dim> u_grad;
            for (unsigned int d = 0; d < dim; ++d)
              u_grad[d] = u_grads_per_cell[q][d];

            const SymmetricTensor<2, dim>
                epsilon = 0.5 * (u_grad + transpose(u_grad));

            const SymmetricTensor<2, dim>
                sigma = 2.0 * lame_mu * epsilon
                        + lame_lambda * trace(epsilon) * identity;

            locally_owned_load_force += sigma * fe_face_values.normal_vector(q)
                                        * fe_face_values.JxW(q);
          }
        }
    }

  // Collect values from all processors
  Tensor<1,dim> load_force;
  load_force[0] = Utilities::MPI::sum(locally_owned_load_force[0], mpi_comm);
  load_force[1] = Utilities::MPI::sum(locally_owned_load_force[1], mpi_comm);

  return load_force;
}
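
For context, I call this function after each solve, roughly like this (just a
sketch; the variable names here are simplified for the email, and `pcout` is a
ConditionalOStream that prints only on rank 0):

// Hypothetical call site, with illustrative names.
const Tensor<1, dim> load_force =
    postprocessor.compute_load_force(dof_handler, solution,
                                     lame_mu, lame_lambda,
                                     fe_degree, top_boundary_id,
                                     mpi_communicator);
pcout << "load force y: " << load_force[1] << std::endl;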


Thank you!

Best,
Qing



Wolfgang Bangerth

Mar 29, 2018, 3:09:24 PM
to dea...@googlegroups.com
On 03/28/2018 03:48 PM, Qing Yin wrote:
>
> I would like to post that snippet of code. To be honest, I verified the
> solution in a naive way: I compared the solutions for two different
> numbers of processes in ParaView. The comparison (including the data range
> and a plot of the value over time at a point) shows they are exactly the same.
>
> [code snipped]

That's pretty much exactly how I would have written this code myself. I
don't see anything that is obviously wrong. You'll have to debug this a
bit, for example by outputting epsilon or sigma or sigma·n for each
quadrature point and comparing what you get on one processor with what
you get on two or four processors. If you choose the mesh coarse enough,
you should be able to see where these quadrature points are and what the
values are. Then, if you know whether the stresses are the same, you
know whether the problem is with the integration routine above, or
whether it is with the solution that is passed to the function. There is
no easier way to find out what the issue is, I believe, given that the
code "looks ok to me".

I assume that you are, of course, running in debug mode.