[JIRA] (JENKINS-54988) jenkins queue is locked (not able to start any build) because KubernetesSlave._terminate stuck

alexeygrigorov@gmail.com (JIRA)

Dec 3, 2018, 12:39:03 PM
to jenkinsc...@googlegroups.com
Alexey Grigorov created an issue
 
Jenkins / Bug JENKINS-54988
jenkins queue is locked (not able to start any build) because KubernetesSlave._terminate stuck
Issue Type: Bug
Assignee: Carlos Sanchez
Components: kubernetes-plugin
Created: 2018-12-03 17:38
Priority: Critical
Reporter: Alexey Grigorov

Hi guys, apparently this call can hang forever, and while it hangs it keeps the Jenkins queue locked, so no build is able to start.
https://github.com/jenkinsci/kubernetes-plugin/blob/master/src/main/java/org/csanchez/jenkins/plugins/kubernetes/KubernetesSlave.java#L262

Here is a stack trace from a thread dump:

"jenkins.util.Timer [#6] / waiting for JNLP4-connect connection from 10.77.102.75/10.77.102.75:57072 id=77127042" daemon prio=5 TIMED_WAITING
	java.lang.Object.wait(Native Method)
	hudson.remoting.Request.call(Request.java:177)
	hudson.remoting.Channel.call(Channel.java:954)
	org.csanchez.jenkins.plugins.kubernetes.KubernetesSlave._terminate(KubernetesSlave.java:236)
	hudson.slaves.AbstractCloudSlave.terminate(AbstractCloudSlave.java:67)
	hudson.slaves.CloudRetentionStrategy.check(CloudRetentionStrategy.java:59)
	hudson.slaves.CloudRetentionStrategy.check(CloudRetentionStrategy.java:43)
	hudson.slaves.SlaveComputer$4.run(SlaveComputer.java:843)
	hudson.model.Queue._withLock(Queue.java:1380)
	hudson.model.Queue.withLock(Queue.java:1257)
	hudson.slaves.SlaveComputer.setNode(SlaveComputer.java:840)
	hudson.model.AbstractCIBase.updateComputer(AbstractCIBase.java:121)
	hudson.model.AbstractCIBase.access$000(AbstractCIBase.java:46)
	hudson.model.AbstractCIBase$2.run(AbstractCIBase.java:207)
	hudson.model.Queue._withLock(Queue.java:1380)
	hudson.model.Queue.withLock(Queue.java:1257)
	hudson.model.AbstractCIBase.updateComputerList(AbstractCIBase.java:190)
	jenkins.model.Jenkins.updateComputerList(Jenkins.java:1552)
	jenkins.model.Nodes$6.run(Nodes.java:261)
	hudson.model.Queue._withLock(Queue.java:1380)
	hudson.model.Queue.withLock(Queue.java:1257)
	jenkins.model.Nodes.removeNode(Nodes.java:252)
	jenkins.model.Jenkins.removeNode(Jenkins.java:2066)
	hudson.slaves.AbstractCloudSlave.terminate(AbstractCloudSlave.java:70)
	hudson.slaves.CloudRetentionStrategy.check(CloudRetentionStrategy.java:59)
	hudson.slaves.CloudRetentionStrategy.check(CloudRetentionStrategy.java:43)
	hudson.slaves.SlaveComputer$4.run(SlaveComputer.java:843)
	hudson.model.Queue._withLock(Queue.java:1380)
	hudson.model.Queue.withLock(Queue.java:1257)
	hudson.slaves.SlaveComputer.setNode(SlaveComputer.java:840)
	hudson.model.AbstractCIBase.updateComputer(AbstractCIBase.java:121)
	hudson.model.AbstractCIBase.access$000(AbstractCIBase.java:46)
	hudson.model.AbstractCIBase$2.run(AbstractCIBase.java:207)
	hudson.model.Queue._withLock(Queue.java:1380)
	hudson.model.Queue.withLock(Queue.java:1257)
	hudson.model.AbstractCIBase.updateComputerList(AbstractCIBase.java:190)
	jenkins.model.Jenkins.updateComputerList(Jenkins.java:1552)
	jenkins.model.Nodes$6.run(Nodes.java:261)
	hudson.model.Queue._withLock(Queue.java:1380)
	hudson.model.Queue.withLock(Queue.java:1257)
	jenkins.model.Nodes.removeNode(Nodes.java:252)
	jenkins.model.Jenkins.removeNode(Jenkins.java:2066)
	hudson.slaves.AbstractCloudSlave.terminate(AbstractCloudSlave.java:70)
	hudson.slaves.CloudRetentionStrategy.check(CloudRetentionStrategy.java:59)
	hudson.slaves.CloudRetentionStrategy.check(CloudRetentionStrategy.java:43)
	hudson.slaves.SlaveComputer$4.run(SlaveComputer.java:843)
	hudson.model.Queue._withLock(Queue.java:1380)
	hudson.model.Queue.withLock(Queue.java:1257)
	hudson.slaves.SlaveComputer.setNode(SlaveComputer.java:840)
	hudson.model.AbstractCIBase.updateComputer(AbstractCIBase.java:121)
	hudson.model.AbstractCIBase.access$000(AbstractCIBase.java:46)
	hudson.model.AbstractCIBase$2.run(AbstractCIBase.java:207)
	hudson.model.Queue._withLock(Queue.java:1380)
	hudson.model.Queue.withLock(Queue.java:1257)
	hudson.model.AbstractCIBase.updateComputerList(AbstractCIBase.java:190)
	jenkins.model.Jenkins.updateComputerList(Jenkins.java:1552)
	jenkins.model.Nodes$6.run(Nodes.java:261)
	hudson.model.Queue._withLock(Queue.java:1380)
	hudson.model.Queue.withLock(Queue.java:1257)
	jenkins.model.Nodes.removeNode(Nodes.java:252)
	jenkins.model.Jenkins.removeNode(Jenkins.java:2066)
	hudson.slaves.AbstractCloudSlave.terminate(AbstractCloudSlave.java:70)
	hudson.slaves.CloudRetentionStrategy.check(CloudRetentionStrategy.java:59)
	hudson.slaves.CloudRetentionStrategy.check(CloudRetentionStrategy.java:43)
	hudson.slaves.SlaveComputer$4.run(SlaveComputer.java:843)
	hudson.model.Queue._withLock(Queue.java:1380)
	hudson.model.Queue.withLock(Queue.java:1257)
	hudson.slaves.SlaveComputer.setNode(SlaveComputer.java:840)
	hudson.model.AbstractCIBase.updateComputer(AbstractCIBase.java:121)
	hudson.model.AbstractCIBase.access$000(AbstractCIBase.java:46)
	hudson.model.AbstractCIBase$2.run(AbstractCIBase.java:207)
	hudson.model.Queue._withLock(Queue.java:1380)
	hudson.model.Queue.withLock(Queue.java:1257)
	hudson.model.AbstractCIBase.updateComputerList(AbstractCIBase.java:190)
	jenkins.model.Jenkins.updateComputerList(Jenkins.java:1552)
	jenkins.model.Nodes$6.run(Nodes.java:261)
	hudson.model.Queue._withLock(Queue.java:1380)
	hudson.model.Queue.withLock(Queue.java:1257)
	jenkins.model.Nodes.removeNode(Nodes.java:252)
	jenkins.model.Jenkins.removeNode(Jenkins.java:2066)
	hudson.slaves.AbstractCloudSlave.terminate(AbstractCloudSlave.java:70)
	hudson.slaves.CloudRetentionStrategy.check(CloudRetentionStrategy.java:59)
	hudson.slaves.CloudRetentionStrategy.check(CloudRetentionStrategy.java:43)
	hudson.slaves.SlaveComputer$4.run(SlaveComputer.java:843)
	hudson.model.Queue._withLock(Queue.java:1380)
	hudson.model.Queue.withLock(Queue.java:1257)
	hudson.slaves.SlaveComputer.setNode(SlaveComputer.java:840)
	hudson.model.AbstractCIBase.updateComputer(AbstractCIBase.java:121)
	hudson.model.AbstractCIBase.access$000(AbstractCIBase.java:46)
	hudson.model.AbstractCIBase$2.run(AbstractCIBase.java:207)
	hudson.model.Queue._withLock(Queue.java:1380)
	hudson.model.Queue.withLock(Queue.java:1257)
	hudson.model.AbstractCIBase.updateComputerList(AbstractCIBase.java:190)
	jenkins.model.Jenkins.updateComputerList(Jenkins.java:1552)
	jenkins.model.Nodes$6.run(Nodes.java:261)
	hudson.model.Queue._withLock(Queue.java:1380)
	hudson.model.Queue.withLock(Queue.java:1257)
	jenkins.model.Nodes.removeNode(Nodes.java:252)
	jenkins.model.Jenkins.removeNode(Jenkins.java:2066)
	hudson.slaves.AbstractCloudSlave.terminate(AbstractCloudSlave.java:70)
	hudson.slaves.CloudRetentionStrategy.check(CloudRetentionStrategy.java:59)
	hudson.slaves.CloudRetentionStrategy.check(CloudRetentionStrategy.java:43)
	hudson.slaves.SlaveComputer$4.run(SlaveComputer.java:843)
	hudson.model.Queue._withLock(Queue.java:1380)
	hudson.model.Queue.withLock(Queue.java:1257)
	hudson.slaves.SlaveComputer.setNode(SlaveComputer.java:840)
	hudson.model.AbstractCIBase.updateComputer(AbstractCIBase.java:121)
	hudson.model.AbstractCIBase.access$000(AbstractCIBase.java:46)
	hudson.model.AbstractCIBase$2.run(AbstractCIBase.java:207)
	hudson.model.Queue._withLock(Queue.java:1380)
	hudson.model.Queue.withLock(Queue.java:1257)
	hudson.model.AbstractCIBase.updateComputerList(AbstractCIBase.java:190)
	jenkins.model.Jenkins.updateComputerList(Jenkins.java:1552)
	jenkins.model.Nodes$6.run(Nodes.java:261)
	hudson.model.Queue._withLock(Queue.java:1380)
	hudson.model.Queue.withLock(Queue.java:1257)
	jenkins.model.Nodes.removeNode(Nodes.java:252)
	jenkins.model.Jenkins.removeNode(Jenkins.java:2066)
	hudson.slaves.AbstractCloudSlave.terminate(AbstractCloudSlave.java:70)
	hudson.slaves.CloudRetentionStrategy.check(CloudRetentionStrategy.java:59)
	hudson.slaves.CloudRetentionStrategy.check(CloudRetentionStrategy.java:43)
	hudson.slaves.SlaveComputer$4.run(SlaveComputer.java:843)
	hudson.model.Queue._withLock(Queue.java:1380)
	hudson.model.Queue.withLock(Queue.java:1257)
	hudson.slaves.SlaveComputer.setNode(SlaveComputer.java:840)
	hudson.model.AbstractCIBase.updateComputer(AbstractCIBase.java:121)
	hudson.model.AbstractCIBase.access$000(AbstractCIBase.java:46)
	hudson.model.AbstractCIBase$2.run(AbstractCIBase.java:207)
	hudson.model.Queue._withLock(Queue.java:1380)
	hudson.model.Queue.withLock(Queue.java:1257)
	hudson.model.AbstractCIBase.updateComputerList(AbstractCIBase.java:190)
	jenkins.model.Jenkins.updateComputerList(Jenkins.java:1552)
	jenkins.model.Nodes$6.run(Nodes.java:261)
	hudson.model.Queue._withLock(Queue.java:1380)
	hudson.model.Queue.withLock(Queue.java:1257)
	jenkins.model.Nodes.removeNode(Nodes.java:252)
	jenkins.model.Jenkins.removeNode(Jenkins.java:2066)
	hudson.slaves.AbstractCloudSlave.terminate(AbstractCloudSlave.java:70)
	hudson.slaves.CloudRetentionStrategy.check(CloudRetentionStrategy.java:59)
	hudson.slaves.CloudRetentionStrategy.check(CloudRetentionStrategy.java:43)
	hudson.slaves.SlaveComputer$4.run(SlaveComputer.java:843)
	hudson.model.Queue._withLock(Queue.java:1380)
	hudson.model.Queue.withLock(Queue.java:1257)
	hudson.slaves.SlaveComputer.setNode(SlaveComputer.java:840)
	hudson.model.AbstractCIBase.updateComputer(AbstractCIBase.java:121)
	hudson.model.AbstractCIBase.access$000(AbstractCIBase.java:46)
	hudson.model.AbstractCIBase$2.run(AbstractCIBase.java:207)
	hudson.model.Queue._withLock(Queue.java:1380)
	hudson.model.Queue.withLock(Queue.java:1257)
	hudson.model.AbstractCIBase.updateComputerList(AbstractCIBase.java:190)
	jenkins.model.Jenkins.updateComputerList(Jenkins.java:1552)
	jenkins.model.Nodes$6.run(Nodes.java:261)
	hudson.model.Queue._withLock(Queue.java:1380)
	hudson.model.Queue.withLock(Queue.java:1257)
	jenkins.model.Nodes.removeNode(Nodes.java:252)
	jenkins.model.Jenkins.removeNode(Jenkins.java:2066)
	hudson.slaves.AbstractCloudSlave.terminate(AbstractCloudSlave.java:70)
	hudson.slaves.CloudRetentionStrategy.check(CloudRetentionStrategy.java:59)
	hudson.slaves.CloudRetentionStrategy.check(CloudRetentionStrategy.java:43)
	hudson.slaves.SlaveComputer$4.run(SlaveComputer.java:843)
	hudson.model.Queue._withLock(Queue.java:1380)
	hudson.model.Queue.withLock(Queue.java:1257)
	hudson.slaves.SlaveComputer.setNode(SlaveComputer.java:840)
	hudson.model.AbstractCIBase.updateComputer(AbstractCIBase.java:121)
	hudson.model.AbstractCIBase.access$000(AbstractCIBase.java:46)
	hudson.model.AbstractCIBase$2.run(AbstractCIBase.java:207)
	hudson.model.Queue._withLock(Queue.java:1380)
	hudson.model.Queue.withLock(Queue.java:1257)
	hudson.model.AbstractCIBase.updateComputerList(AbstractCIBase.java:190)
	jenkins.model.Jenkins.updateComputerList(Jenkins.java:1552)
	jenkins.model.Nodes$6.run(Nodes.java:261)
	hudson.model.Queue._withLock(Queue.java:1380)
	hudson.model.Queue.withLock(Queue.java:1257)
	jenkins.model.Nodes.removeNode(Nodes.java:252)
	jenkins.model.Jenkins.removeNode(Jenkins.java:2066)
	hudson.slaves.AbstractCloudSlave.terminate(AbstractCloudSlave.java:70)
	hudson.slaves.CloudRetentionStrategy.check(CloudRetentionStrategy.java:59)
	hudson.slaves.CloudRetentionStrategy.check(CloudRetentionStrategy.java:43)
	hudson.slaves.SlaveComputer$4.run(SlaveComputer.java:843)
	hudson.model.Queue._withLock(Queue.java:1380)
	hudson.model.Queue.withLock(Queue.java:1257)
	hudson.slaves.SlaveComputer.setNode(SlaveComputer.java:840)
	hudson.model.AbstractCIBase.updateComputer(AbstractCIBase.java:121)
	hudson.model.AbstractCIBase.access$000(AbstractCIBase.java:46)
	hudson.model.AbstractCIBase$2.run(AbstractCIBase.java:207)
	hudson.model.Queue._withLock(Queue.java:1380)
	hudson.model.Queue.withLock(Queue.java:1257)
	hudson.model.AbstractCIBase.updateComputerList(AbstractCIBase.java:190)
	jenkins.model.Jenkins.updateComputerList(Jenkins.java:1552)
	jenkins.model.Nodes$6.run(Nodes.java:261)
	hudson.model.Queue._withLock(Queue.java:1380)
	hudson.model.Queue.withLock(Queue.java:1257)
	jenkins.model.Nodes.removeNode(Nodes.java:252)
	jenkins.model.Jenkins.removeNode(Jenkins.java:2066)
	hudson.slaves.AbstractCloudSlave.terminate(AbstractCloudSlave.java:70)
	hudson.slaves.CloudRetentionStrategy.check(CloudRetentionStrategy.java:59)
	hudson.slaves.CloudRetentionStrategy.check(CloudRetentionStrategy.java:43)
	hudson.slaves.SlaveComputer$4.run(SlaveComputer.java:843)
	hudson.model.Queue._withLock(Queue.java:1380)
	hudson.model.Queue.withLock(Queue.java:1257)
	hudson.slaves.SlaveComputer.setNode(SlaveComputer.java:840)
	hudson.model.AbstractCIBase.updateComputer(AbstractCIBase.java:121)
	hudson.model.AbstractCIBase.access$000(AbstractCIBase.java:46)
	hudson.model.AbstractCIBase$2.run(AbstractCIBase.java:207)
	hudson.model.Queue._withLock(Queue.java:1380)
	hudson.model.Queue.withLock(Queue.java:1257)
	hudson.model.AbstractCIBase.updateComputerList(AbstractCIBase.java:190)
	jenkins.model.Jenkins.updateComputerList(Jenkins.java:1552)
	jenkins.model.Nodes$6.run(Nodes.java:261)
	hudson.model.Queue._withLock(Queue.java:1380)
	hudson.model.Queue.withLock(Queue.java:1257)
	jenkins.model.Nodes.removeNode(Nodes.java:252)
	jenkins.model.Jenkins.removeNode(Jenkins.java:2066)
	hudson.slaves.AbstractCloudSlave.terminate(AbstractCloudSlave.java:70)
	hudson.slaves.CloudRetentionStrategy.check(CloudRetentionStrategy.java:59)
	hudson.slaves.CloudRetentionStrategy.check(CloudRetentionStrategy.java:43)
	hudson.slaves.ComputerRetentionWork$1.run(ComputerRetentionWork.java:72)
	hudson.model.Queue._withLock(Queue.java:1380)
	hudson.model.Queue.withLock(Queue.java:1257)
	hudson.slaves.ComputerRetentionWork.doRun(ComputerRetentionWork.java:63)
	hudson.triggers.SafeTimerTask.run(SafeTimerTask.java:72)
	jenkins.security.ImpersonatingScheduledExecutorService$1.run(ImpersonatingScheduledExecutorService.java:58)
	java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
	java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
	java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
	java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	java.lang.Thread.run(Thread.java:748)
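
In the trace above, the thread is waiting in hudson.remoting.Request.call (reached via Channel.call from KubernetesSlave._terminate) while frames further down are inside Queue.withLock. One generic way to keep a blocking call from holding a lock indefinitely is to run it on a helper thread and wait with a deadline. The following is only a minimal sketch of that idea in plain Java, not the plugin's code or its eventual fix; the class BoundedTerminate, the QUEUE_LOCK stand-in, and both method names are hypothetical and exist only for illustration.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.locks.ReentrantLock;

public class BoundedTerminate {
    // Hypothetical stand-in for the Jenkins queue lock (not hudson.model.Queue).
    public static final ReentrantLock QUEUE_LOCK = new ReentrantLock();

    /**
     * Runs a call that may hang on a daemon helper thread and waits at most
     * timeoutMillis for it. Returns true only if the call completed normally.
     */
    public static boolean callWithTimeout(Callable<Void> remoteCall, long timeoutMillis)
            throws InterruptedException {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "bounded-remote-call");
            t.setDaemon(true); // do not keep the JVM alive for a stuck call
            return t;
        });
        Future<Void> future = executor.submit(remoteCall);
        try {
            future.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the helper thread and give up
            return false;
        } catch (ExecutionException e) {
            return false; // the call itself failed
        } finally {
            executor.shutdownNow();
        }
    }

    /** Holds the lock only for as long as the bounded call takes. */
    public static boolean terminateUnderLock(Callable<Void> remoteCall, long timeoutMillis)
            throws InterruptedException {
        QUEUE_LOCK.lock();
        try {
            return callWithTimeout(remoteCall, timeoutMillis);
        } finally {
            QUEUE_LOCK.unlock(); // always released, even when the call timed out
        }
    }

    public static void main(String[] args) throws Exception {
        // Simulates a remote call that never answers within the deadline.
        Callable<Void> hangs = () -> { Thread.sleep(60_000); return null; };
        boolean finished = terminateUnderLock(hangs, 200);
        System.out.println("finished=" + finished + ", lockHeld=" + QUEUE_LOCK.isLocked());
    }
}
```

With the simulated hang, the caller gives up after roughly 200 ms and the lock is released, so other threads waiting on it can proceed. Whether a timeout, an asynchronous call, or moving the remote call outside the lock is the right choice for the plugin is a separate design question.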
 
This message was sent by Atlassian Jira (v7.11.2#711002-sha1:fdc329d)

alexeygrigorov@gmail.com (JIRA)

Dec 3, 2018, 12:42:02 PM
to jenkinsc...@googlegroups.com

alexeygrigorov@gmail.com (JIRA)

Dec 3, 2018, 12:46:02 PM
to jenkinsc...@googlegroups.com
Alexey Grigorov updated an issue
Hi guys, apparently this call can hang forever, and while it hangs it keeps the Jenkins queue locked, so no build is able to start.
[https://github.com/jenkinsci/kubernetes-plugin/blob/master/src/main/java/org/csanchez/jenkins/plugins/kubernetes/KubernetesSlave.java#L262]

Here is a stack trace from a thread dump:
{noformat}
{noformat}

!2018-12-01 23_13_46-Jenkins.png|thumbnail!

alexeygrigorov@gmail.com (JIRA)

Dec 3, 2018, 12:46:02 PM
to jenkinsc...@googlegroups.com

alexeygrigorov@gmail.com (JIRA)

Dec 4, 2018, 8:23:03 AM
to jenkinsc...@googlegroups.com
Alexey Grigorov updated an issue
Hi guys, apparently this call can hang forever, and while it hangs it keeps the Jenkins queue locked, so no build is able to start.
[https://github.com/jenkinsci/kubernetes-plugin/blob/master/src/main/java/org/csanchez/jenkins/plugins/kubernetes/KubernetesSlave.java#L262]

Here is a stack trace from a thread dump and a screenshot

alexeygrigorov@gmail.com (JIRA)

Dec 4, 2018, 8:24:03 AM
to jenkinsc...@googlegroups.com
Alexey Grigorov updated an issue
Hi guys, apparently this call can hang forever, and while it hangs it keeps the Jenkins queue locked, so no build is able to start.
[https://github.com/jenkinsci/kubernetes-plugin/blob/master/src/main/java/org/csanchez/jenkins/plugins/kubernetes/KubernetesSlave.java#L262]

Here is a stack trace from a thread dump and a screenshot

jglick@cloudbees.com (JIRA)

Jul 16, 2019, 3:43:46 PM
to jenkinsc...@googlegroups.com

jglick@cloudbees.com (JIRA)

Oct 30, 2019, 12:57:03 PM
to jenkinsc...@googlegroups.com

jglick@cloudbees.com (JIRA)

Oct 31, 2019, 11:36:02 AM
to jenkinsc...@googlegroups.com