[JIRA] (JENKINS-53683) Parallel pod-provisioning fails for distinct pod templates


fabian-braun+jenkins@mailbox.org (JIRA)

Sep 20, 2018, 9:44:03 AM
to jenkinsc...@googlegroups.com
Fabian Braun created an issue
 
Jenkins / Bug JENKINS-53683
Parallel pod-provisioning fails for distinct pod templates
Issue Type: Bug
Assignee: Carlos Sanchez
Attachments: failure_jnlp_agent.txt, jenkins_log.txt, kubernetes_pod_template_docker.PNG, kubernetes_pod_template_docker_volumes.PNG
Components: kubernetes-plugin
Created: 2018-09-20 13:43
Environment: Openshift Master: 3.9.0
Kubernetes Master: 1.9.1
Jenkins: 2.138.1
Kubernetes Plugin: 1.12.6
Priority: Major
Reporter: Fabian Braun

When we start two Jenkins jobs simultaneously that use different pod templates, pod provisioning fails.
OpenShift tries to create the pods, but they are either terminated immediately or end up in an error state with all containers running except jnlp (ConnectionRefusalException: Unknown client name).
This happens continuously until the container cap is reached or until one of the two builds in the waiting queue is aborted manually.
Everything works fine if the two jobs use the same pod template; we only observe this behavior for different pod templates.
The attached logs demonstrate the behavior.
The two pods that are provisioned first are:

  • docker-4h6km (kubernetes pod template name "docker")
  • jenkinsslave-rzk3k-wr4wz (kubernetes pod template name "jenkinsslave")

The former is configured in the global configuration of the Jenkins UI; the latter is configured directly in the pipeline script.
The pod templates have different names and labels, but both contain a container named "docker".
Strangely, in the Jenkins log we observe a KubernetesSlave _terminate on the docker pod directly after provisioning:

INFO: Waiting for Pod to be scheduled (1/100): docker-4h6km
Sep 20, 2018 12:14:28 PM hudson.slaves.NodeProvisioner$2 run
INFO: Kubernetes Pod Template provisioning successfully completed. We have now 3 computer(s)
Sep 20, 2018 12:14:28 PM org.csanchez.jenkins.plugins.kubernetes.KubernetesSlave _terminate
INFO: Terminating Kubernetes instance for agent docker-4h6km
Sep 20, 2018 12:14:28 PM okhttp3.internal.platform.Platform log

Could the subsequent errors be caused by this early termination of the pod?

Attached files:

  • jenkins_log.txt - log of the Jenkins master
  • failure_jnlp_agent.txt - log of the jnlp agent in one of the failed pods
  • kubernetes_pod_template_docker.PNG - configuration of the docker pod in the Jenkins UI
  • kubernetes_pod_template_docker_volumes.PNG - volume configuration of the docker pod in the Jenkins UI
 
This message was sent by Atlassian Jira (v7.11.2#711002-sha1:fdc329d)

fabian-braun+jenkins@mailbox.org (JIRA)

Sep 20, 2018, 10:04:02 AM
to jenkinsc...@googlegroups.com
Fabian Braun updated an issue
Change By: Fabian Braun

We have two Kubernetes masters and four nodes.

jenkins-ci@carlossanchez.eu (JIRA)

Sep 20, 2018, 10:48:03 AM
to jenkinsc...@googlegroups.com
Carlos Sanchez commented on Bug JENKINS-53683
 
Re: Parallel pod-provisioning fails for distinct pod templates

Can you provide the master logs at DEBUG level for the plugin?

For more detail, configure a new Jenkins log recorder for org.csanchez.jenkins.plugins.kubernetes at ALL level.

 

fabian-braun+jenkins@mailbox.org (JIRA)

Sep 20, 2018, 11:18:01 AM
to jenkinsc...@googlegroups.com
Fabian Braun updated an issue
Change By: Fabian Braun
Attachment: jenkins_master_log_ALL.txt

fabian-braun+jenkins@mailbox.org (JIRA)

Sep 20, 2018, 11:21:01 AM
to jenkinsc...@googlegroups.com
Fabian Braun commented on Bug JENKINS-53683
 
Re: Parallel pod-provisioning fails for distinct pod templates

Hi, thanks for the quick response. I thought I had actually done that.
To be certain, I just produced another master log file with the additional logger for the Kubernetes plugin.
Look for jenkinsslave-rzk3k-skgw1 and docker-jldkv in this new file (jenkins_master_log_ALL.txt).

jenkins-ci@carlossanchez.eu (JIRA)

Sep 20, 2018, 11:24:01 AM
to jenkinsc...@googlegroups.com

I don't see any DEBUG statement in the log

fabian-braun+jenkins@mailbox.org (JIRA)

Sep 20, 2018, 12:03:02 PM
to jenkinsc...@googlegroups.com
Fabian Braun updated an issue
Change By: Fabian Braun
Attachment: kubernetes_plugin_log.txt

fabian-braun+jenkins@mailbox.org (JIRA)

Sep 20, 2018, 12:13:01 PM
to jenkinsc...@googlegroups.com
Fabian Braun commented on Bug JENKINS-53683
 
Re: Parallel pod-provisioning fails for distinct pod templates

Sorry, I exported the logs from STDOUT instead of from the Jenkins UI.
Here is the file copied from the Jenkins UI: kubernetes_plugin_log.txt

However, DEBUG statements are not present, only "FINE" and "FINER". Is that sufficient, or do I need to configure debug logging in Jenkins' logging.properties as well?
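
On the FINE/FINER question: Jenkins log recorders are built on java.util.logging, which has no level named DEBUG; FINE, FINER and FINEST are its debug-equivalent levels. The sketch below is only an illustration of that point in plain Java, not plugin code. The class name JulLevelCheck and its helper methods are made up for this example; only the logger name comes from this thread.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class JulLevelCheck {
    // Logger name taken from the thread; everything else here is illustrative.
    static final String PLUGIN_LOGGER = "org.csanchez.jenkins.plugins.kubernetes";

    /** True if the name parses as a java.util.logging level (FINE, ALL, ...). */
    static boolean isJulLevelName(String name) {
        try {
            Level.parse(name);
            return true;
        } catch (IllegalArgumentException e) {
            // Level.parse rejects names it does not know, such as "DEBUG".
            return false;
        }
    }

    /** True if a logger set to ALL would emit records at the given level. */
    static boolean emittedAtAll(Level level) {
        Logger logger = Logger.getLogger(PLUGIN_LOGGER);
        logger.setLevel(Level.ALL);
        return logger.isLoggable(level);
    }

    public static void main(String[] args) {
        System.out.println("DEBUG is a JUL level: " + isJulLevelName("DEBUG"));
        for (Level l : new Level[] {Level.INFO, Level.FINE, Level.FINER, Level.FINEST}) {
            System.out.println(l.getName() + " emitted at ALL: " + emittedAtAll(l));
        }
    }
}
```

Under that assumption, a recorder set to ALL that shows FINE and FINER entries is already capturing the most detailed output those loggers produce, so no separate DEBUG setting should be needed.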

fabian-braun+jenkins@mailbox.org (JIRA)

Sep 21, 2018, 7:43:21 AM
to jenkinsc...@googlegroups.com

OK, we found the root cause of the error:

Our instance cap was set to 6 in the Kubernetes cloud configuration. The two pod templates together required 6 containers (4 for the jenkinsslave pod, 2 for the docker pod). I guess that reaching the container cap triggered the immediate deletion of the pods. That would explain why it worked for two docker pods simultaneously (which required only 4 containers in total).

We removed the instance cap from the cloud config, and now the pods are created simultaneously without problems.

Feel free to close this issue.

jenkins-ci@carlossanchez.eu (JIRA)

Sep 21, 2018, 8:11:01 AM
to jenkinsc...@googlegroups.com

That can't be the issue, because the instance cap applies to pods, not containers, and it doesn't kill pods after they have been started.

fabian-braun+jenkins@mailbox.org (JIRA)

Sep 21, 2018, 9:07:02 AM
to jenkinsc...@googlegroups.com
Fabian Braun updated an issue
Change By: Fabian Braun
Attachment: kubernetes_cloud_config.PNG

fabian-braun+jenkins@mailbox.org (JIRA)

Sep 21, 2018, 10:32:02 AM
to jenkinsc...@googlegroups.com
Fabian Braun commented on Bug JENKINS-53683
 
Re: Parallel pod-provisioning fails for distinct pod templates

Indeed, we're observing the behavior again. If we launch enough jobs simultaneously, it also happens when they're all based on the same pod template.

fabian-braun+jenkins@mailbox.org (JIRA)

Sep 21, 2018, 10:35:03 AM
to jenkinsc...@googlegroups.com
Fabian Braun updated an issue
Change By: Fabian Braun
- Everything works fine if the two jobs use the same pod-template, we only observe this behavior for different pod templates. -

fabian-braun+jenkins@mailbox.org (JIRA)

Sep 26, 2018, 3:19:02 AM
to jenkinsc...@googlegroups.com
 
Re: Parallel pod-provisioning fails for distinct pod templates

OK, I believe I have found the root cause (once again...).
The comment by Aiman Alsari on JENKINS-47501 seems to describe the same problem. I have tried his workaround of setting idleMinutes: 1 in the pod templates, and now I'm able to run more than 6 jobs in parallel using different pod templates.

fabian-braun+jenkins@mailbox.org (JIRA)

Sep 26, 2018, 3:21:02 AM
to jenkinsc...@googlegroups.com
Fabian Braun edited a comment on Bug JENKINS-53683
OK, I believe I have found the root cause (once again...).
The comment by Aiman Alsari [here|https://issues.jenkins-ci.org/browse/JENKINS-47501?focusedCommentId=336622&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-336622] seems to describe the same problem. I have tried his workaround of setting {{idleMinutes: 1}} in the pod templates, and now I'm able to run more than 6 jobs in parallel using different pod templates.

fabian-braun+jenkins@mailbox.org (JIRA)

Sep 27, 2018, 3:40:02 AM
to jenkinsc...@googlegroups.com
Fabian Braun updated an issue
Change By: Fabian Braun
Attachment: jenkins_master_log_ALL.txt

jglick@cloudbees.com (JIRA)

Jul 16, 2019, 3:43:17 PM
to jenkinsc...@googlegroups.com
Jesse Glick assigned an issue to Unassigned
Change By: Jesse Glick
Assignee: Carlos Sanchez