Why is the backoff duration of a failed CertificateRequest hard-coded to an hour?


Yelei Wu

Dec 28, 2020, 1:34:52 AM
to cert-manager-dev
Hi there,

We have been using cert-manager to request certificates from Let's Encrypt and occasionally encounter a problem where Let's Encrypt cannot verify the DNS01 challenge response even though cert-manager's self-check succeeds. Deleting the CertificateRequest to trigger a retry always solves the problem. However, the default retry backoff for a failed CertificateRequest is hard-coded to an hour in cert-manager:

```
log.Info("the failed existing certificate request failed less than an hour ago, will be scheduled for reprocessing in an hour")
```

I'm wondering whether there is any specific reason for choosing this duration. If not, should we make it configurable?

Best Regards,
Aylei

Maël Valais

Jun 17, 2021, 3:29:09 AM
to cert-manager-dev
That is a great question, so I dug into the arbitrary "one hour" that cert-manager uses. In trigger_controller.go:

```
// the amount of time after the LastFailureTime of a Certificate
// before the request should be retried.
// In future this should be replaced with a more dynamic exponential
// back-off algorithm.
retryAfterLastFailure = time.Hour
```

The commit itself gives no further indication, other than that one hour seemed to be a reasonable backoff duration.

As a cert-manager user, I would probably expect an exponential backoff instead of a fixed one, with an upper limit of something like one hour.