Kubernetes plugin - pod ready timeout


Romain Grécourt

unread,
Jun 25, 2018, 7:47:05 PM6/25/18
to Jenkins Users
Hi,

I'm building a Jenkins pipeline with the Kubernetes plugin (1.8.2).
I have stages running in parallel in different pods, created dynamically using a "script" block.

See an equivalent Jenkinsfile attached.

The K8s cluster is composed of 1 master and 14 minions (2 CPUs / 16 GB of memory each). The pod template specifies a resource limit of CPU=1.75.

Bottom line: there are not enough resources in the cluster to schedule all stages at once.
I'm expecting the unscheduled pods to wait for resources to become available in the cluster.

However, this is not what I'm seeing.
At first, all the unscheduled pods are visible in the PENDING state, but after a while they are deleted.
The result is that the remaining pipeline jobs are waiting for deleted pods to come up; i.e., the pipeline will never complete.

The Jenkins log has a lot of lines like these:

INFO: Waiting for Pod to be scheduled (25/100): mypod-4cg34-5vpjx

It seems that there is a hard-coded timeout of 100 * 6 = 600 seconds for waiting on any given pod to reach the "RUNNING" state.

How is this supposed to work when the cluster does not have enough resources to spawn all required pods?

Are the unscheduled pods supposed to be rescheduled at a later time by Jenkins?
If not, should that timeout be configurable?
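For reference, the setup described above can be sketched roughly as follows. This is an illustrative reconstruction only, not the actual attached Jenkinsfile; the job names and container image are hypothetical, while the dynamic "mypod-${job}" labels and the CPU=1.75 limit match the description above.

```groovy
// Illustrative sketch only -- not the actual attached Jenkinsfile.
// Job names and image are hypothetical; the resource limit matches
// the CPU=1.75 mentioned above.
def jobs = ['job1', 'job2', 'job3']
def branches = [:]

for (j in jobs) {
    def job = j  // capture the loop variable for the closure
    branches[job] = {
        podTemplate(label: "mypod-${job}",
                    containers: [
                        containerTemplate(name: 'build',
                                          image: 'maven:3-jdk-8',
                                          ttyEnabled: true,
                                          command: 'cat',
                                          resourceLimitCpu: '1.75')
                    ]) {
            node("mypod-${job}") {
                stage(job) {
                    container('build') {
                        sh 'echo building'
                    }
                }
            }
        }
    }
}

parallel branches
```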

Thanks,
Romain





Jenkinsfile

Carlos Sanchez

unread,
Jun 25, 2018, 8:28:09 PM6/25/18
to Jenkins Users
The k8s plugin will only wait for 600 seconds, but Jenkins will call the Kubernetes plugin multiple times until an agent with the specified label is available.

--
You received this message because you are subscribed to the Google Groups "Jenkins Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to jenkinsci-use...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/jenkinsci-users/7c2be20f-ef63-48fe-bedb-fa0a59d113b4%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Romain Grécourt

unread,
Jun 26, 2018, 7:56:06 PM6/26/18
to jenkins...@googlegroups.com
Looks like this was caused by a bad label in my Jenkinsfile.

The pod template defined the label "mypod", but the dynamically created stages specified "mypod-${job}". (There was a typo in the file I attached: s/pod/job/g.)
I changed that so the label is fixed to "mypod", and I'm now seeing the expected behavior: queued jobs are scheduled when resources become available, and the pipeline completes.

Are there any implications to having a fixed label?
E.g., different git branches with Jenkinsfiles where the pod template is different but the label value is the same.



Carlos Sanchez

unread,
Jun 28, 2018, 5:43:34 AM6/28/18
to Jenkins Users
On Tue, Jun 26, 2018 at 7:56 PM Romain Grécourt <romain....@gmail.com> wrote:

Are there any implications to having a fixed label?
E.g., different git branches with Jenkinsfiles where the pod template is different but the label value is the same.

Yes, you should use UUIDs as labels; there are examples in the README.
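The UUID pattern Carlos refers to looks roughly like the sketch below (the container name and image are illustrative): each build generates a unique label, so pod templates from different branches or jobs never collide, while Jenkins still retries provisioning until a matching agent comes up.

```groovy
// Unique label per build, following the pattern shown in the
// kubernetes-plugin README. Container name/image are illustrative.
def label = "mypod-${UUID.randomUUID().toString()}"

podTemplate(label: label,
            containers: [
                containerTemplate(name: 'build',
                                  image: 'maven:3-jdk-8',
                                  ttyEnabled: true,
                                  command: 'cat')
            ]) {
    node(label) {
        stage('Build') {
            container('build') {
                sh 'mvn -version'
            }
        }
    }
}
```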