How to add perturbations when the loss function is stuck at gradient zero

Suneel Kumar

Aug 21, 2024, 3:34:20 AM

to Ceres Solver

Hi,

For a few months I used a custom non-linear equation solver for basic geometric constraints, such as fixing the distance between two points or the angle between two lines. I built it by writing the costs and their Jacobians with respect to the vertices involved, and solved with the vanilla Newton-Raphson method.

I recently switched to Ceres Solver for its dynamic autodiff and for robustness, but I am facing a problem related to perturbations.

Sometimes, say when two points should touch each other, I solve only for the 6 degrees of freedom of the two objects (Pos x, y, z and Rot x, y, z) and transform the entire objects. Pos and Rot are commonly 0, so to avoid getting stuck at a zero gradient because the parameters are zero, I perturb these values by 1e-3 whenever (cost > threshold and Jacobian norm < threshold) and let the iterations continue. Is there a feature like that in Ceres?

P.S. I explored custom callbacks, but it seems the address of the parameters I pass in differs from the address of the parameters seen in the cost function's operator(). Can I get a suggestion on how to handle this the Ceres way?

Sameer Agarwal

Aug 21, 2024, 9:08:20 AM

to ceres-...@googlegroups.com

There is no perturbation ability in Ceres. That said, can you describe your objective function? I am curious why your gradient is zero when your parameter values are zero.
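Since the solver itself does not perturb, one option is to do it between solves in an outer loop. The following is a minimal self-contained sketch of that pattern, not Ceres code: `Cost`, `Gradient`, `InnerSolve`, and `SolveWithPerturbation` are hypothetical names, the inner solver is plain gradient descent on a toy 1-D cost, and the tolerances are assumptions. With Ceres, the inner step would presumably be a call to `ceres::Solve`, and `ceres::Problem::Evaluate` can report the cost and gradient at the current parameter values.

```cpp
#include <cmath>

// Toy 1-D cost standing in for a real problem: 0.5 * sin(theta)^2.
// Its gradient, sin(theta) * cos(theta), vanishes at theta = pi/2 even
// though the cost there is 0.5.
double Cost(double theta) { return 0.5 * std::sin(theta) * std::sin(theta); }
double Gradient(double theta) { return std::sin(theta) * std::cos(theta); }

// Hypothetical stand-in for the inner solver: gradient descent that stops
// as soon as the gradient is small, whether or not the cost is small.
double InnerSolve(double theta, double gradient_tol, int max_iterations) {
  const double step = 0.5;
  for (int i = 0; i < max_iterations; ++i) {
    const double g = Gradient(theta);
    if (std::fabs(g) < gradient_tol) break;
    theta -= step * g;
  }
  return theta;
}

// Outer loop: if the inner solve stops with a high cost but a near-zero
// gradient, nudge the parameter and solve again.
double SolveWithPerturbation(double theta) {
  const double kCostTol = 1e-12;      // assumed "cost is small enough" level
  const double kGradientTol = 1e-10;  // assumed "gradient is zero" level
  const double kPerturbation = 1e-3;  // same size as in the question
  const int kMaxRestarts = 5;
  for (int restart = 0;; ++restart) {
    theta = InnerSolve(theta, kGradientTol, 1000);
    const bool stalled = Cost(theta) > kCostTol &&
                         std::fabs(Gradient(theta)) < kGradientTol;
    if (!stalled || restart == kMaxRestarts) return theta;
    theta += kPerturbation;
  }
}
```

Starting from `theta = pi/2`, the first inner solve returns immediately because the gradient is already below tolerance; after one perturbation the second solve moves to `theta = pi`, where the cost is zero. Because the perturbation happens between solves, it needs no iteration callback.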

--
You received this message because you are subscribed to the Google Groups "Ceres Solver" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ceres-solver...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/ceres-solver/751bdbe1-3869-4dd1-82bd-e94da2fd3dban%40googlegroups.com.

Suneel Kumar

Aug 23, 2024, 2:30:28 AM

to Ceres Solver

Below is the operator() of my dynamic autodiff cost function. At every iteration there are additional constraints, such as fixing solid element 1 and applying a parallel cost between a line belonging to solid element 1 and a line belonging to solid element 2, to align the line of solid element 2 with that of solid element 1.

The local direction (xd1, yd1, zd1) often has values like (0, 0, 1), and (rot_x, rot_y, rot_z) often have values like (0, Pi, 2*Pi).

template <typename T>
bool ParallelLines3D_Cost::operator()(const T* const* variables, T* residual) const {

  // Rotation of the first solid element
  const T& rot_x1 = variables[0][ var_indices[0] ];
  const T& rot_y1 = variables[0][ var_indices[1] ];
  const T& rot_z1 = variables[0][ var_indices[2] ];

  // Rotation of the second solid element
  const T& rot_x2 = variables[0][ var_indices[3] ];
  const T& rot_y2 = variables[0][ var_indices[4] ];
  const T& rot_z2 = variables[0][ var_indices[5] ];

  // (linedir_x1, linedir_y1, linedir_z1) => global direction of line 1
  // (linedir_x2, linedir_y2, linedir_z2) => global direction of line 2
  // (xd1, yd1, zd1) => local direction of line 1, relative to solid element 1
  // (xd2, yd2, zd2) => local direction of line 2, relative to solid element 2

  T linedir_x1 = ( ( ( (ace_ptr->xd1) * cos( rot_y1 ) + ( (ace_ptr->yd1) * sin( rot_x1 ) + (ace_ptr->zd1) * cos( rot_x1 ) ) * sin( rot_y1 ) ) * cos( rot_z1 ) )
                 - ( ( (ace_ptr->yd1) * cos( rot_x1 ) - (ace_ptr->zd1) * sin( rot_x1 ) ) * sin( rot_z1 ) ) );

  T linedir_x2 = ( ( ( (ace_ptr->xd2) * cos( rot_y2 ) + ( (ace_ptr->yd2) * sin( rot_x2 ) + (ace_ptr->zd2) * cos( rot_x2 ) ) * sin( rot_y2 ) ) * cos( rot_z2 ) )
                 - ( ( (ace_ptr->yd2) * cos( rot_x2 ) - (ace_ptr->zd2) * sin( rot_x2 ) ) * sin( rot_z2 ) ) );

  T linedir_y1 = ( ( ( (ace_ptr->xd1) * cos( rot_y1 ) + ( (ace_ptr->yd1) * sin( rot_x1 ) + (ace_ptr->zd1) * cos( rot_x1 ) ) * sin( rot_y1 ) ) * sin( rot_z1 ) )
                 + ( ( (ace_ptr->yd1) * cos( rot_x1 ) - (ace_ptr->zd1) * sin( rot_x1 ) ) * cos( rot_z1 ) ) );

  T linedir_y2 = ( ( ( (ace_ptr->xd2) * cos( rot_y2 ) + ( (ace_ptr->yd2) * sin( rot_x2 ) + (ace_ptr->zd2) * cos( rot_x2 ) ) * sin( rot_y2 ) ) * sin( rot_z2 ) )
                 + ( ( (ace_ptr->yd2) * cos( rot_x2 ) - (ace_ptr->zd2) * sin( rot_x2 ) ) * cos( rot_z2 ) ) );

  T linedir_z1 = ( ( -1*(ace_ptr->xd1)*sin( rot_y1 ) + ( (ace_ptr->yd1) * sin( rot_x1 ) + (ace_ptr->zd1) * cos( rot_x1 ) ) )*cos( rot_y1 ) );

  T linedir_z2 = ( ( -1*(ace_ptr->xd2)*sin( rot_y2 ) + ( (ace_ptr->yd2) * sin( rot_x2 ) + (ace_ptr->zd2) * cos( rot_x2 ) ) )*cos( rot_y2 ) );

  // The cross product of the two directions is zero for parallel lines
  residual[0] = linedir_y1*linedir_z2 - linedir_z1*linedir_y2;
  residual[1] = linedir_x1*linedir_z2 - linedir_z1*linedir_x2;
  residual[2] = linedir_x1*linedir_y2 - linedir_y1*linedir_x2;

  return true;
}
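
One configuration in which a cross-product residual like this has a zero gradient at a non-zero cost is when the two directions are exactly perpendicular, which axis-aligned local directions with rotations of 0 or Pi can easily produce. The sketch below is a reduced planar illustration, not the cost function above: `Vec3`, `Cross`, `ParallelCost`, and `ParallelCostGradient` are hypothetical names, one direction is fixed along x, and the other is a unit vector rotated by a single angle about z. The cost is then 0.5 * sin(angle)^2, whose derivative sin(angle) * cos(angle) is zero at 90 degrees, where the cost is at its maximum.

```cpp
#include <array>
#include <cmath>

using Vec3 = std::array<double, 3>;

// Cross product a x b.
Vec3 Cross(const Vec3& a, const Vec3& b) {
  return {a[1] * b[2] - a[2] * b[1],
          a[2] * b[0] - a[0] * b[2],
          a[0] * b[1] - a[1] * b[0]};
}

// Cost 0.5 * |a x b|^2, with a fixed along x and b a unit vector rotated
// by `angle` about z (angle = 0 means b is parallel to a).
double ParallelCost(double angle) {
  const Vec3 a = {1.0, 0.0, 0.0};
  const Vec3 b = {std::cos(angle), std::sin(angle), 0.0};
  const Vec3 r = Cross(a, b);
  return 0.5 * (r[0] * r[0] + r[1] * r[1] + r[2] * r[2]);
}

// Central-difference estimate of d(cost)/d(angle).
double ParallelCostGradient(double angle) {
  const double h = 1e-6;
  return (ParallelCost(angle + h) - ParallelCost(angle - h)) / (2.0 * h);
}
```

Evaluating `ParallelCost` and `ParallelCostGradient` at an angle of Pi/2 gives a cost of 0.5 with a gradient estimate of essentially zero, while at Pi/4 the gradient is about 0.5. Both vectors are unit length here, so vector scale plays no role in this example.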

Suneel Kumar

Sep 4, 2024, 1:40:56 AM

to Ceres Solver

Hi Sameer,

Any suggestions for this?

Sameer Agarwal

Sep 4, 2024, 9:22:49 AM

to ceres-...@googlegroups.com

Sorry, this dropped off my radar.

Your residual is the cross product of two vectors. While it is true that it will be zero if the two vectors are parallel, it will also be zero if either of them is zero. In other words, the magnitude of your residual does not scale with parallelism alone; it scales with parallelism AND the magnitude of the vectors themselves.

One thing to try is to define the residual on the normalized vectors, i.e.

T norm1 = sqrt(linedir_x1*linedir_x1 + linedir_y1*linedir_y1 + linedir_z1*linedir_z1);

T norm2 = sqrt(linedir_x2*linedir_x2 + linedir_y2*linedir_y2 + linedir_z2*linedir_z2);

residual[0] = (linedir_y1*linedir_z2 - linedir_z1*linedir_y2)/(norm1 * norm2);
residual[1] = (linedir_x1*linedir_z2 - linedir_z1*linedir_x2)/(norm1 * norm2);
residual[2] = (linedir_x1*linedir_y2 - linedir_y1*linedir_x2)/(norm1 * norm2);
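
The effect of the normalization can be checked in isolation. The following is a minimal sketch, assuming the two global directions are already available as 3-element arrays; `NormalizedCrossResidual` is a hypothetical helper name, and returning `false` for a zero-length direction is an assumption of this sketch, not something stated in the thread. It is templated on `T` so the same body could be used with `double` or with an autodiff scalar type.

```cpp
#include <cmath>

// Writes the cross-product terms of d1 and d2, divided by the product of
// their norms, into residual[0..2]. The signs match the snippet above.
template <typename T>
bool NormalizedCrossResidual(const T* d1, const T* d2, T* residual) {
  using std::sqrt;  // allows another sqrt overload to be found for non-double T
  const T norm1 = sqrt(d1[0] * d1[0] + d1[1] * d1[1] + d1[2] * d1[2]);
  const T norm2 = sqrt(d2[0] * d2[0] + d2[1] * d2[1] + d2[2] * d2[2]);
  // Assumption: report a failed evaluation rather than divide by zero.
  if (norm1 == T(0.0) || norm2 == T(0.0)) return false;
  const T scale = T(1.0) / (norm1 * norm2);
  residual[0] = (d1[1] * d2[2] - d1[2] * d2[1]) * scale;
  residual[1] = (d1[0] * d2[2] - d1[2] * d2[0]) * scale;
  residual[2] = (d1[0] * d2[1] - d1[1] * d2[0]) * scale;
  return true;
}
```

With this form, (1, 0, 0) against (0, 1, 0) and (10, 0, 0) against (0, 0.001, 0) give the same residual, so its magnitude depends only on the angle between the two directions.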

Sameer
