Frequent "Runtime process errors: Unable to connect to the server: net/http: TLS handshake timeout" when issuing kubectl create


bg303

May 29, 2018, 11:59:13 AM5/29/18
to Kubernetes user discussion and Q&A
I have scheduled jobs that run throughout the day. These jobs create YAML manifests, authenticate with another Kubernetes cluster, and issue "kubectl create -f job.yml".
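For context, a minimal sketch of that flow; the Job name, image, and command below are placeholders, not the poster's actual manifest:

```shell
# Generate a minimal Job manifest (names and image are placeholders).
cat > job.yml <<'EOF'
apiVersion: batch/v1
kind: Job
metadata:
  name: example-job
spec:
  template:
    spec:
      containers:
      - name: worker
        image: busybox
        command: ["echo", "hello"]
      restartPolicy: Never
EOF

# After authenticating against the target cluster, create the Job.
kubectl create -f job.yml
```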

Dozens of times a day, the job fails because this response comes back from Kubernetes:

"Runtime process errors: Unable to connect to the server: net/http: TLS handshake timeout"

The job gets retried seconds later and succeeds, but I would like to reduce/eliminate the error from happening.

Does anyone know why this happens? The commands are being issued inside a container running in a separate Kubernetes cluster in the same GCP region.

Daniel Smith

May 29, 2018, 2:39:06 PM5/29/18
to kubernet...@googlegroups.com
Either you have a flaky network or the apiserver(s) you're trying to contact are crashing a lot. My guess is the latter: most people observing that error have done something like create 5-digit numbers of Jobs on an underprovisioned control plane (e.g., a 1-core VM).


Daniel Smith

May 29, 2018, 2:41:05 PM5/29/18
to kubernet...@googlegroups.com
I wrote that without noticing that you actually say you're using Jobs :)

Please look at whether you clean up (delete) finished Jobs. There's a pod GC that deletes old finished pods, but there isn't one for Jobs. I bet it will get better if you delete your old, finished Jobs.
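A hedged sketch of that cleanup, assuming the Jobs live in the current namespace (Jobs support the `status.successful` field selector, so completed ones can be targeted directly):

```shell
# List Jobs that have completed successfully.
kubectl get jobs --field-selector status.successful=1

# Delete them once you're satisfied with the list.
kubectl delete jobs --field-selector status.successful=1
```

Later Kubernetes releases also added `spec.ttlSecondsAfterFinished` to the Job spec (stable in v1.23), which has the control plane delete finished Jobs automatically after the given number of seconds.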

bg303

May 29, 2018, 3:31:28 PM5/29/18
to Kubernetes user discussion and Q&A

Ah, that might do it. I have thousands of old jobs that have never been cleaned up.

I'm curious: is this a known issue, and is it something that might be addressed in a later release?

Daniel Smith

May 29, 2018, 3:44:35 PM5/29/18
to kubernet...@googlegroups.com
This is definitely a known issue to me :)

But it seems like there actually isn't an issue tracking this. I filed https://github.com/kubernetes/kubernetes/issues/64470