Cert-Manager failing because associated order is in an “invalid” state


Mohd Rashid

Mar 9, 2025, 3:48:45 PM
to cert-manager-dev
Hi,

I'm unable to create a Certificate and ClusterIssuer using Helm charts. I get the error "The certificate request has failed to complete and will be retried: Failed to wait for order resource "ml-models-tls-secret-1-3822340619" to become ready: order is in "invalid" state". I'm using Helm charts to deploy:

nginx-ingress-controller
cert-manager
cert-manager-issuer
my service/deployment
I'm deploying all four of these with Helm charts in an AKS cluster. Below, the certificate shows False in the READY column:

kubectl get certificate -n test
NAME                   READY   SECRET                 AGE
ml-models-tls-secret   False   ml-models-tls-secret   88s
Here is the describe output with the details:

kubectl describe certificate ml-models-tls-secret -n test
Events:
  Type     Reason     Age   From                                       Message
  ----     ------     ----  ----                                       -------
  Normal   Issuing    114s  cert-manager-certificates-trigger          Issuing certificate as Secret does not exist
  Normal   Generated  114s  cert-manager-certificates-key-manager      Stored new private key in temporary Secret resource "ml-models-tls-secret-xf8vl"
  Normal   Requested  114s  cert-manager-certificates-request-manager  Created new CertificateRequest resource "ml-models-tls-secret-1"
  Warning  Failed     82s   cert-manager-certificates-issuing          The certificate request has failed to complete and will be retried: Failed to wait for order resource "ml-models-tls-secret-1-3822340619" to become ready: order is in "invalid" state:
Here are the secrets in the namespace:

kubectl get secret -n test
NAME                                        TYPE                 DATA   AGE
sh.helm.release.v1.cert-manager-issuer.v1   helm.sh/release.v1   1      3m1s
sh.helm.release.v1.ml-models.v1             helm.sh/release.v1   1      2m23s
Here is the ingress, attached to the correct IP address:

kubectl get ingress -n test
NAME                CLASS   HOSTS             ADDRESS          PORTS     AGE
ingress-ml-models   nginx   me.ml.test.ai    20.233.205.227   80, 443   6m35s
Here is the ClusterIssuer, showing READY as True:

kubectl get clusterissuer
NAME             READY   AGE
letsencrypt-me   True    8m1s
Here is the order, in the invalid state:

kubectl get order -n test
NAME                                STATE     AGE
ml-models-tls-secret-1-3822340619   invalid   7m51s
Here is the challenge, also in the invalid state:

kubectl get challenges -n test
NAME                                           STATE     DOMAIN            AGE
ml-models-tls-secret-1-3822340619-3896448402   invalid   me.ml.test.ai    9m15s
kubectl logs pod/cert-manager-8576d99cc8-vw4sj -n cert-manager

sync.go:403] "error waiting for authorization" err="acme: authorization error for me.ml.test.ai: 400 urn:ietf:params:acme:error:connection: 20.233.205.227: Fetching http://me.ml.test.ai/.well-known/acme-challenge/R1665D99bj_6hF1uG69ajDId8xXilq8rjomXrSG8T1o: Timeout during connect (likely firewall problem)" logger="cert-manager.controller.acceptChallenge" resource_name="ml-models-tls-secret-1-3822340619-3896448402" resource_namespace="test" resource_kind="Challenge" resource_version="v1" dnsName="me.ml.test.ai" type="HTTP-01"
E0309 11:27:01.183367 1 controller.go:104] "Unhandled Error" err="ingress 'test/cm-acme-http-solver-wwbc6' in work queue no longer exists" logger="UnhandledError"
I0309 11:27:01.568965 1 conditions.go:201] "Found status change for Certificate condition; setting lastTransitionTime" logger="cert-manager" certificate="test/ml-models-tls-secret" condition="Issuing" oldStatus="True" status="False" lastTransitionTime="2025-03-09 11:27:01.56894821 +0000 UTC m=+15172.283709596"
I0309 11:27:01.582382 1 trigger_controller.go:202] "Backing off from issuance due to previously failed issuance(s). Issuance will next be attempted at 2025-03-09 12:27:01.0000008 +0000 UTC m=+18771.714762286" logger="cert-manager.controller" key="test/ml-models-tls-secret"
I0309 11:27:01.611463 1 trigger_controller.go:202] "Backing off from issuance due to previously failed issuance(s). Issuance will next be attempted at 2025-03-09 12:27:01.0000007 +0000 UTC m=+18771.714762086" logger="cert-manager.controller" key="test/ml-models-tls-secret"
E0309 11:27:01.885881 1 sync.go:75] "failed to update status" logger="cert-manager.controller" resource_name="ml-models-tls-secret-1-3822340619" resource_namespace="test" resource_kind="Order" resource_version="v1"
I0309 11:27:01.885920 1 controller.go:152] "re-queuing item due to optimistic locking on resource" logger="cert-manager.controller" error="Operation cannot be fulfilled on orders.acme.cert-manager.io\"ml-models-tls-secret-1-3822340619\": the object has been modified; please apply your changes to the latest version and try again" lated_resource_kind="" related_resource_version=""
E0309 11:26:04.054167 1 sync.go:208] "propagation check failed" err="wrong status code '502', expected '200'" logger="cert-manager.controller" resource_name="ml-models-tls-secret-1-3822340619-1399653640" resource_namespace="test" resource_kind="Challenge" resource_version="v1" dnsName="me.ml.test.ai " type="HTTP-01"
Please tell me what I did wrong.

Also, which one should I deploy first: ingress-nginx, cert-manager, or the Let's Encrypt issuer?

Peter Fiddes

Mar 27, 2025, 5:35:14 AM
to cert-manager-dev
> Also, which one should I deploy first: ingress-nginx, cert-manager, or the Let's Encrypt issuer?

You will want to deploy them in the order you listed initially.
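
As a rough sketch of that order, something like the following could work. The chart sources are assumptions: I'm guessing the ingress controller comes from the ingress-nginx repo and cert-manager from the jetstack repo, and the two `./charts/...` paths are hypothetical stand-ins for your own issuer and application charts. The release names and the `test` namespace are taken from your secret listing. `crds.enabled=true` assumes a recent cert-manager chart; older charts use `installCRDs=true` instead.

```shell
#!/bin/sh
# Sketch of the install order. Chart sources and local paths are assumptions.

# run: print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

deploy_in_order() {
  # 0. Chart repositories (assumed sources).
  run helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx || return 1
  run helm repo add jetstack https://charts.jetstack.io || return 1
  run helm repo update || return 1

  # 1. Ingress controller first: the HTTP-01 solver is served through it.
  run helm upgrade --install ingress-nginx ingress-nginx/ingress-nginx \
    --namespace ingress-nginx --create-namespace --wait || return 1

  # 2. cert-manager and its CRDs, which must exist before any issuer is created.
  run helm upgrade --install cert-manager jetstack/cert-manager \
    --namespace cert-manager --create-namespace --set crds.enabled=true --wait || return 1

  # 3. The ClusterIssuer chart (hypothetical local path).
  run helm upgrade --install cert-manager-issuer ./charts/cert-manager-issuer \
    --namespace test --create-namespace --wait || return 1

  # 4. The application chart with its Ingress and Certificate (hypothetical local path).
  run helm upgrade --install ml-models ./charts/ml-models \
    --namespace test --wait || return 1
}
```

Setting `DRY_RUN=1` before calling `deploy_in_order` only prints the commands, which is a cheap way to review the sequence before touching the cluster.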

At a glance, the issue appears to be this:

> Timeout during connect (likely firewall problem)

You should run a debug pod in the cert-manager namespace and check whether you can resolve the domain "me.ml.test.ai" with nslookup or dig.
This checks that, from where cert-manager runs, the domain can be resolved and reached.
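
A minimal sketch of those checks, under a few assumptions: the pod name `dns-debug` and the `busybox:1.36` image are my own picks, and the helper function names are just for illustration. I've also added an HTTP check because the Let's Encrypt error is a connect timeout on port 80, so it is worth fetching the challenge URL from a machine outside the cluster as well.

```shell
#!/bin/sh
# Sketch: DNS check from inside the cluster, HTTP check from outside it.
# Function names, pod name, and image are illustrative assumptions.

# challenge_url: build the HTTP-01 URL that the ACME server fetches for a token.
challenge_url() {
  printf 'http://%s/.well-known/acme-challenge/%s\n' "$1" "$2"
}

# dns_check: resolve the domain from a throwaway pod in the cert-manager namespace.
dns_check() {
  kubectl run dns-debug --namespace cert-manager --rm -i --restart=Never \
    --image=busybox:1.36 -- nslookup "$1"
}

# http_check: run this from OUTSIDE the cluster. It prints the HTTP status code.
# A 404 for a made-up token still proves port 80 is reachable; a timeout
# reproduces what the ACME server reported.
http_check() {
  curl --silent --show-error --max-time 10 --output /dev/null \
    --write-out '%{http_code}\n' "$(challenge_url "$1" "$2")"
}
```

For example, `dns_check me.ml.test.ai` followed by `http_check me.ml.test.ai test-token`. If the second one times out, I would look at anything filtering inbound port 80 on the way to 20.233.205.227 before looking at cert-manager itself.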

Thanks, Peter