PETSc Solver error in Random Timestep

63 views

Skip to first unread message

Zacharias-Panagiotis Oikonomou

unread,

Oct 25, 2023, 10:05:02 AM10/25/23

to deal.II User Group

Dear all,

I am an engineering student at NTUA, and for my thesis, I am trying to simulate the dynamic response of a cylidrical shaft.

The code is based on the step_18 tutorial and on step_40 for the efficient distribution of the RAM load. In essence, it tries to model the Newmark-Beta/HHT integration method in order to solve the elastic wave equation.

The problem seems to be that in a random timestep, the variable named temp takes a NaN value, so I solved the problem with the if(isnan()) condition (as shown below), but I was wondering if anyone has encountered this problem and how they solved it.

If the condition does not exist, the error is the following, also encountered when running the code in series:

Timestep 84 at time 8.4
Assembling system: timestep = 84

Total Area of the propeller shaft: 0.374583
Total Area of section: 0.489861
Pressure= 0 for RPM = 80
Centroid= -6 -4.13827e-05 1.79398e-05
Polar Moment of Inertia J= 0.0381965 Starting to construct matrices for 0 is done.
Matrices for the 0 process are done.
Predictors calculations
Newton Matrices Assembled
Residual Linear Calced. norm of rhs is -nan
norm of rhs is 2.26552e+10
---------------------------------------------------------
TimerOutput objects finalize timed values printed to the
screen by communicating over MPI in their destructors.
Since an exception is currently uncaught, this
synchronization (and subsequent output) will be skipped
to avoid a possible deadlock.
---------------------------------------------------------

----------------------------------------------------
Exception on processing:

----------------------------------------------------
Exception on processing:

----------------------------------------------------
Exception on processing:

----------------------------------------------------
Exception on processing:

----------------------------------------------------
Exception on processing:

----------------------------------------------------
Exception on processing:

----------------------------------------------------
Exception on processing:

----------------------------------------------------
Exception on processing:

----------------------------------------------------
Exception on processing:

----------------------------------------------------
Exception on processing:

----------------------------------------------------
Exception on processing:

----------------------------------------------------
Exception on processing:

--------------------------------------------------------
An error occurred in line <134> of file </home/zaxosoik/dealii/source/lac/petsc_solver.cc> in function
void dealii::PETScWrappers::SolverBase::solve(const dealii::PETScWrappers::MatrixBase&, dealii::PETScWrappers::VectorBase&, const dealii::PETScWrappers::VectorBase&, const dealii::PETScWrappers::PreconditionBase&)
The violated condition was:
false
Additional information:
The iterative method reported a convergence failure in step 4294967295. The
The residual in the last step was nan.

This error message can indicate that you have simply not allowed a
sufficiently large number of iterations for your iterative solver to
converge. This often happens when you increase the size of your
problem. In such cases, the last residual will likely still be very
small, and you can make the error go away by increasing the allowed
number of iterations when setting up the SolverControl object that
determines the maximum number of iterations you allow.

The other situation where this error may occur is when your matrix is
not invertible (e.g., your matrix has a null-space), or if you try to
apply the wrong solver to a matrix (e.g., using CG for a matrix that
is not symmetric or positive-definite). In these cases, the
residual in the last iteration is likely going to be large.

Stacktrace:
-----------
#0 /usr/local/lib/libdeal_II.so.9.6.0-pre: dealii::PETScWrappers::SolverBase::solve(dealii::PETScWrappers::MatrixBase const&, dealii::PETScWrappers::VectorBase&, dealii::PETScWrappers::VectorBase const&, dealii::PETScWrappers::PreconditionBase const&)
#1 ./parametric_shaft_dynamic_v9: ParametricShaft::TopLevel<3>::solve_linear_problem()
#2 ./parametric_shaft_dynamic_v9: ParametricShaft::TopLevel<3>::solve_timestep()
#3 ./parametric_shaft_dynamic_v9: ParametricShaft::TopLevel<3>::do_timestep()
#4 ./parametric_shaft_dynamic_v9: ParametricShaft::TopLevel<3>::run()
#5 ./parametric_shaft_dynamic_v9: main
--------------------------------------------------------

Aborting!

The problem seems to not be limited to the rhs.l2_norm being -nan, as I have encountered it even if the norm is a Real number (not shown here)

The problem stems from the use of FEFaceValues values from the following code:

  for (unsigned int face = 0; face < n_faces_per_cell; ++face)
            {


              if (cell->face(face)->at_boundary() && (cell->face(face)->boundary_id() == 1)
              //if (cell->boundary_id() == 1)
                  {
                         
                            fe_face_values.reinit(cell, face);
                            //fe_values.reinit(cell);
                            //if (faces_touched[])
                            propeller_pressure.vector_value_list(fe_face_values.get_quadrature_points(), 
                                                                    propeller_pressure_values);
                                        
                            propeller_torque.vector_value_list(fe_face_values.get_quadrature_points(), 
                                                      propeller_torque_values);
  
                          
                          

                            for (unsigned int q_point = 0; q_point < n_faces_per_cell;
                                ++q_point)
                              {
                                Tensor<1, dim> rhs_values;
                                for (unsigned int i = 0; i < dim; ++i)
                                  {
                                    rhs_values[i] = propeller_torque_values[q_point][i]+propeller_pressure_values[q_point][i];
                                  }
                                for (unsigned int i = 0; i < dofs_per_cell; ++i)            
                                  {
                                        const unsigned int component_i =
                                                                    fe.system_to_component_index(i).first;
                                        double temp = fe_face_values.shape_value(i,q_point) *
                                            propeller_torque_values[q_point](component_i) *
                                            fe_face_values.JxW(q_point);
                                            if (isnan(temp))
                                                cell_rhs(i) +=0;
                                            else
                                                cell_rhs(i) += temp; 
                                        countboundary1 +=1;


                                  }
                                }
                   }

The function called propeller_torque values is the following:

template<int dim>
  class PropellerTorque : public Function<dim>
  {
    public:
      PropellerTorque(const double torque, const double polar_moment, const double radius);

      virtual void vector_value(const Point<dim> &p,
                                Vector<double> &values) const override;
      virtual void 
      vector_value_list(const std::vector<Point<dim>> &points,
                      std::vector<Vector<double>> &  value_list) const override;
      
    private:
      const double torque; 
      const double polar_moment;
      const double radius;
  };

  template <int dim>
  PropellerTorque<dim>::PropellerTorque(const double torque, const double polar_moment, const double radius)
    : Function<dim>(dim), torque(torque), polar_moment(polar_moment), radius(radius)
  {}

  template <int dim>
  inline void PropellerTorque<dim>::vector_value(const Point<dim> &p,
                                                Vector<double> &values) const
  {
    AssertDimension(values.size(), dim);
    double y = p(1);
    double z = p(2);
    double distance = sqrt(y*y+z*z);
    if (distance<=0)
        distance=0;
    double angle = atan2(z,y);
    //if (angle
    double tau = -torque*radius/(polar_moment);
    if (isnan(tau))
        tau=0;
     if (isnan(angle))
        {
            angle=0;
            tau=0;
        }
    
    values = 0;
    values(1) = tau* sin(angle);
    values(2) = -tau* cos(angle);

  }

  template <int dim>
  inline void PropellerTorque<dim>::vector_value_list(
    const std::vector<Point<dim>> &points,
    std::vector<Vector<double>> &  value_list) const
  {
    const unsigned int n_points = points.size();
 
    AssertDimension(value_list.size(), n_points);
 
    for (unsigned int p = 0; p < n_points; ++p)
      PropellerTorque<dim>::vector_value(points[p], value_list[p]);
  }

Attached, you can find the code in.cpp and the parameters.xml that are required.

Thank you in advance for your response, and thank you for creating the tutorials.

Kind Regards,

Oikonomou Zacharias

SNAME NTUA, Athens, Greece

parameters.xml

parametric_shaft_dynamic_v9.cpp

Wolfgang Bangerth

unread,

Oct 25, 2023, 7:38:23 PM10/25/23

to dea...@googlegroups.com

On 10/25/23 08:05, Zacharias-Panagiotis Oikonomou wrote:
> *The problem seems to be that in a random timestep, the variable named
> /temp /takes a NaN value, so I solved the problem with the if(isnan())
> condition *(as shown below), but I was wondering if anyone has

> encountered this problem and how they solved it.

Well, you didn't actually *solve* the problem, you just ignored it with
the if() statement :-)

The way to debug this is to figure out where the NaN came from. To
understand how to do this, note that you end up with a NaN value in your
right hand side vector. But you assembled that rhs vector. So in the
function that assembles the rhs, check every time you write something
(namely, a local contribution) into the vector, whether that something
is a NaN -- that's the only way a NaN can have gotten into your rhs vector!

When you have verified that the NaN gets into the vector through the
assembly, start looking through your assembly function and try to
understand how that NaN got there. Perhaps, you divided by zero
somewhere. Or you took the square root of a negative number.

Best
W.

Reply all

Reply to author

Forward

0 new messages