[JIRA] (JENKINS-54683) Pods entering error state due to ConnectionRefusalException: Unknown client name

jenkins-ci@carlossanchez.eu (JIRA)

Jul 10, 2019, 2:51:33 PM
to jenkinsc...@googlegroups.com
Carlos Sanchez updated an issue
 
Jenkins / Bug JENKINS-54683
Pods entering error state due to ConnectionRefusalException: Unknown client name
Change By: Carlos Sanchez
Summary: Pods entering error state due to ConnectionRefusalException: Unknown client name
This message was sent by Atlassian Jira (v7.11.2#711002-sha1:fdc329d)

tobias.hinz@gmail.com (JIRA)

Jul 10, 2019, 4:26:03 PM
to jenkinsc...@googlegroups.com

tobias.hinz@gmail.com (JIRA)

Jul 13, 2019, 8:19:02 AM
to jenkinsc...@googlegroups.com

I will provide the requested debugging logs by the end of the weekend.

jglick@cloudbees.com (JIRA)

Jul 16, 2019, 3:44:06 PM
to jenkinsc...@googlegroups.com

vavrik.jakub@ifortuna.cz (JIRA)

Sep 3, 2019, 6:31:05 AM
to jenkinsc...@googlegroups.com
Jakub Vavrik commented on Bug JENKINS-54683
 
Re: Pods entering error state due to ConnectionRefusalException: Unknown client name

This behaviour occurs when a job is running on a slave spawned in k8s and some other job in the background calls Jenkins.instance.reload(). After the reload, the Jenkins instance no longer knows that it has a slave with that name, and this exception occurs. I'd say either do not call Jenkins.instance.reload(), or the reload function should not forget the running slaves it has created.
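
To make the suspected sequence concrete, here is a minimal toy model in Java. It is not Jenkins code: the class `ReloadSketch`, its methods, and the idea that a reload simply clears an in-memory set of agent names are all assumptions made for illustration. It only shows why an agent that reconnects after a reload would be refused with "Unknown client name".

```java
import java.util.HashSet;
import java.util.Set;

/** Toy model (hypothetical, not Jenkins internals): a controller tracking agent names in memory. */
class ReloadSketch {
    // Stand-in for the controller's in-memory list of known agents.
    private final Set<String> knownAgents = new HashSet<>();

    /** Registers a dynamically provisioned agent, as a cloud plugin would. */
    void provision(String name) {
        knownAgents.add(name);
    }

    /** Assumption: a reload rebuilds state and drops agents that exist only in memory. */
    void reload() {
        knownAgents.clear();
    }

    /** Accepts a connecting agent only if its name is still known. */
    String handshake(String name) {
        return knownAgents.contains(name) ? "accepted" : "Unknown client name: " + name;
    }

    public static void main(String[] args) {
        ReloadSketch controller = new ReloadSketch();
        String agent = "java11-maven35-nodejs8-boosted-m3tvb";

        controller.provision(agent);
        System.out.println(controller.handshake(agent)); // agent is known before the reload

        controller.reload();
        System.out.println(controller.handshake(agent)); // same agent is refused after the reload
    }
}
```

If this model matches what happens, the build fails not because the pod is unhealthy but because the controller lost the record that would let the pod's agent reconnect.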

vavrik.jakub@ifortuna.cz (JIRA)

Sep 3, 2019, 6:36:07 AM
to jenkinsc...@googlegroups.com
Jakub Vavrik edited a comment on Bug JENKINS-54683
Partial log from master:
{code:java}
Sep 03, 2019 9:49:05 AM io.fabric8.jenkins.openshiftsync.BuildConfigWatcher$3 call
INFO: Updated job web2-bc-admin from BuildConfig NamespaceName{web2:bc-admin} with revision: 356707991
Sep 03, 2019 9:49:05 AM io.fabric8.jenkins.openshiftsync.CredentialsUtils upsertCredential
INFO: Updated credential web2-jenkins-secret-bitbucket from Secret NamespaceName{web2:jenkins-secret-bitbucket} with revision: 171272636
Sep 03, 2019 9:49:05 AM io.fabric8.jenkins.openshiftsync.BuildSyncRunListener onStarted
INFO: starting polling build job/web2/job/web2-bc-admin/3/
Sep 03, 2019 9:49:08 AM org.csanchez.jenkins.plugins.kubernetes.KubernetesCloud provision
INFO: Excess workload after pending Kubernetes agents: 1
Sep 03, 2019 9:49:08 AM org.csanchez.jenkins.plugins.kubernetes.KubernetesCloud provision
INFO: Template for label java11-maven35-nodejs8-boosted: Kubernetes Pod Template
Sep 03, 2019 9:49:08 AM hudson.slaves.NodeProvisioner$StandardStrategyImpl apply
INFO: Started provisioning Kubernetes Pod Template from openshift with 1 executors. Remaining excess workload: 0
Sep 03, 2019 9:49:18 AM hudson.slaves.NodeProvisioner$2 run
INFO: Kubernetes Pod Template provisioning successfully completed. We have now 2 computer(s)
Sep 03, 2019 9:49:18 AM org.csanchez.jenkins.plugins.kubernetes.KubernetesLauncher launch
INFO: Created Pod: web2/java11-maven35-nodejs8-boosted-m3tvb
Sep 03, 2019 9:49:18 AM okhttp3.internal.platform.Platform log
INFO: ALPN callback dropped: HTTP/2 is disabled. Is alpn-boot on the boot class path?
Sep 03, 2019 9:49:22 AM org.csanchez.jenkins.plugins.kubernetes.KubernetesLauncher launch
INFO: Pod is running: web2/java11-maven35-nodejs8-boosted-m3tvb
Sep 03, 2019 9:49:22 AM org.csanchez.jenkins.plugins.kubernetes.KubernetesLauncher launch
INFO: Waiting for agent to connect (0/100): java11-maven35-nodejs8-boosted-m3tvb
Sep 03, 2019 9:49:23 AM org.csanchez.jenkins.plugins.kubernetes.KubernetesLauncher launch
INFO: Waiting for agent to connect (1/100): java11-maven35-nodejs8-boosted-m3tvb
Sep 03, 2019 9:49:24 AM hudson.TcpSlaveAgentListener$ConnectionHandler run
INFO: Accepted JNLP4-connect connection #12 from /10.38.49.11:56832
Sep 03, 2019 9:49:24 AM org.csanchez.jenkins.plugins.kubernetes.KubernetesLauncher launch
INFO: Waiting for agent to connect (2/100): java11-maven35-nodejs8-boosted-m3tvb
Sep 03, 2019 9:49:25 AM org.csanchez.jenkins.plugins.kubernetes.KubernetesLauncher launch
INFO: Waiting for agent to connect (3/100): java11-maven35-nodejs8-boosted-m3tvb
Sep 03, 2019 9:49:26 AM org.csanchez.jenkins.plugins.kubernetes.KubernetesLauncher launch
INFO: Waiting for agent to connect (4/100): java11-maven35-nodejs8-boosted-m3tvb
Sep 03, 2019 9:49:27 AM org.csanchez.jenkins.plugins.kubernetes.KubernetesLauncher launch
INFO: Waiting for agent to connect (5/100): java11-maven35-nodejs8-boosted-m3tvb
INFO: Terminating Kubernetes instance for agent java11-maven35-nodejs8-boosted-m3tvb
Sep 03, 2019 9:58:26 AM okhttp3.internal.platform.Platform log
INFO: ALPN callback dropped: HTTP/2 is disabled. Is alpn-boot on the boot class path?
FATAL: Computer for agent is null: java11-maven35-nodejs8-boosted-m3tvb
Sep 03, 2019 9:58:26 AM org.csanchez.jenkins.plugins.kubernetes.KubernetesSlave _terminate
SEVERE: Computer for agent is null: java11-maven35-nodejs8-boosted-m3tvb
Sep 03, 2019 9:58:26 AM jenkins.slaves.DefaultJnlpSlaveReceiver channelClosed
WARNING: Computer.threadPoolForRemoting [#109] for java11-maven35-nodejs8-boosted-m3tvb terminated
java.nio.channels.ClosedChannelException
at org.jenkinsci.remoting.protocol.impl.ChannelApplicationLayer.onReadClosed(ChannelApplicationLayer.java:209)
at org.jenkinsci.remoting.protocol.ApplicationLayer.onRecvClosed(ApplicationLayer.java:222)
at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.onRecvClosed(ProtocolStack.java:816)
at org.jenkinsci.remoting.protocol.FilterLayer.onRecvClosed(FilterLayer.java:287)
at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.onRecvClosed(SSLEngineFilterLayer.java:181)
at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.switchToNoSecure(SSLEngineFilterLayer.java:283)
at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.processWrite(SSLEngineFilterLayer.java:503)
at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.processQueuedWrites(SSLEngineFilterLayer.java:248)
at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.doSend(SSLEngineFilterLayer.java:200)
at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.doCloseSend(SSLEngineFilterLayer.java:213)
at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.doCloseSend(ProtocolStack.java:784)
at org.jenkinsci.remoting.protocol.ApplicationLayer.doCloseWrite(ApplicationLayer.java:173)
at org.jenkinsci.remoting.protocol.impl.ChannelApplicationLayer$ByteBufferCommandTransport.closeWrite(ChannelApplicationLayer.java:314)
at hudson.remoting.Channel.close(Channel.java:1452)
at hudson.remoting.Channel.close(Channel.java:1405)
at hudson.slaves.SlaveComputer.closeChannel(SlaveComputer.java:832)
at hudson.slaves.SlaveComputer.kill(SlaveComputer.java:799)
at hudson.model.AbstractCIBase.killComputer(AbstractCIBase.java:88)
at hudson.model.AbstractCIBase.updateComputerList(AbstractCIBase.java:227)
at jenkins.model.Jenkins.updateComputerList(Jenkins.java:1581)
at jenkins.model.Nodes$6.run(Nodes.java:261)
at hudson.model.Queue._withLock(Queue.java:1381)
at hudson.model.Queue.withLock(Queue.java:1258)
at jenkins.model.Nodes.removeNode(Nodes.java:252)
at jenkins.model.Jenkins.removeNode(Jenkins.java:2096)
at hudson.slaves.AbstractCloudSlave.terminate(AbstractCloudSlave.java:70)
at org.jenkinsci.plugins.durabletask.executors.OnceRetentionStrategy$1$1.run(OnceRetentionStrategy.java:128)
at hudson.model.Queue._withLock(Queue.java:1381)
at hudson.model.Queue.withLock(Queue.java:1258)
at org.jenkinsci.plugins.durabletask.executors.OnceRetentionStrategy$1.run(OnceRetentionStrategy.java:123)
at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
at jenkins.security.ImpersonatingExecutorService$1.run(ImpersonatingExecutorService.java:59)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Sep 03, 2019 9:58:26 AM org.jenkinsci.plugins.workflow.job.WorkflowRun finish
INFO: web2/web2-bc-admin #3 completed: FAILURE
Sep 03, 2019 9:58:26 AM io.fabric8.jenkins.openshiftsync.BuildSyncRunListener onCompleted
INFO: onCompleted job/web2/job/web2-bc-admin/3/
Sep 03, 2019 9:58:26 AM io.fabric8.jenkins.openshiftsync.BuildSyncRunListener onFinalized
INFO: onFinalized job/web2/job/web2-bc-admin/3/
Sep 03, 2019 9:58:36 AM hudson.TcpSlaveAgentListener$ConnectionHandler run
INFO: Accepted JNLP4-connect connection #13 from /10.38.49.11:57212
Sep 03, 2019 9:58:36 AM org.jenkinsci.remoting.protocol.impl.ConnectionHeadersFilterLayer onRecv
INFO: [JNLP4-connect connection from 10.38.49.11/10.38.49.11:57212] Refusing headers from remote: Unknown client name: java11-maven35-nodejs8-boosted-m3tvb
Sep 03, 2019 9:58:36 AM hudson.TcpSlaveAgentListener$ConnectionHandler run
WARNING: Connection #14 failed
java.io.EOFException
at java.io.DataInputStream.readFully(DataInputStream.java:197)
at java.io.DataInputStream.readFully(DataInputStream.java:169)
at hudson.TcpSlaveAgentListener$ConnectionHandler.run(TcpSlaveAgentListener.java:244)

Sep 03, 2019 10:00:12 AM io.fabric8.jenkins.openshiftsync.BuildWatcher eventReceived
WARNING: Caught: java.lang.NullPointerException
java.lang.NullPointerException
at io.fabric8.jenkins.openshiftsync.JenkinsUtils.deleteRun(JenkinsUtils.java:566)
at io.fabric8.jenkins.openshiftsync.JenkinsUtils.deleteRun(JenkinsUtils.java:575)
at io.fabric8.jenkins.openshiftsync.BuildWatcher.innerDeleteEventToJenkinsJobRun(BuildWatcher.java:424)
at io.fabric8.jenkins.openshiftsync.BuildWatcher.deleteEventToJenkinsJobRun(BuildWatcher.java:453)
at io.fabric8.jenkins.openshiftsync.BuildWatcher.eventReceived(BuildWatcher.java:171)
at io.fabric8.jenkins.openshiftsync.BuildWatcher.eventReceived(BuildWatcher.java:187)
at io.fabric8.jenkins.openshiftsync.WatcherCallback.eventReceived(WatcherCallback.java:34)
at io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager$2.onMessage(WatchConnectionManager.java:237)
at okhttp3.internal.ws.RealWebSocket.onReadMessage(RealWebSocket.java:310)
at okhttp3.internal.ws.WebSocketReader.readMessageFrame(WebSocketReader.java:222)
at okhttp3.internal.ws.WebSocketReader.processNextFrame(WebSocketReader.java:101)
at okhttp3.internal.ws.RealWebSocket.loopReader(RealWebSocket.java:265)
at okhttp3.internal.ws.RealWebSocket$2.onResponse(RealWebSocket.java:204)
at okhttp3.RealCall$AsyncCall.execute(RealCall.java:153)
at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

Sep 03, 2019 10:00:12 AM io.fabric8.jenkins.openshiftsync.BuildWatcher eventReceived
WARNING: Caught: java.lang.NullPointerException
java.lang.NullPointerException
at io.fabric8.jenkins.openshiftsync.JenkinsUtils.deleteRun(JenkinsUtils.java:566)
at io.fabric8.jenkins.openshiftsync.JenkinsUtils.deleteRun(JenkinsUtils.java:575)
at io.fabric8.jenkins.openshiftsync.BuildWatcher.innerDeleteEventToJenkinsJobRun(BuildWatcher.java:424)
at io.fabric8.jenkins.openshiftsync.BuildWatcher.deleteEventToJenkinsJobRun(BuildWatcher.java:453)
at io.fabric8.jenkins.openshiftsync.BuildWatcher.eventReceived(BuildWatcher.java:171)
at io.fabric8.jenkins.openshiftsync.BuildWatcher.eventReceived(BuildWatcher.java:187)
at io.fabric8.jenkins.openshiftsync.WatcherCallback.eventReceived(WatcherCallback.java:34)
at io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager$2.onMessage(WatchConnectionManager.java:237)
at okhttp3.internal.ws.RealWebSocket.onReadMessage(RealWebSocket.java:310)
at okhttp3.internal.ws.WebSocketReader.readMessageFrame(WebSocketReader.java:222)
at okhttp3.internal.ws.WebSocketReader.processNextFrame(WebSocketReader.java:101)
at okhttp3.internal.ws.RealWebSocket.loopReader(RealWebSocket.java:265)
at okhttp3.internal.ws.RealWebSocket$2.onResponse(RealWebSocket.java:204)
at okhttp3.RealCall$AsyncCall.execute(RealCall.java:153)
at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

Sep 03, 2019 10:00:12 AM io.fabric8.jenkins.openshiftsync.BuildWatcher eventReceived
WARNING: Caught: java.lang.NullPointerException
java.lang.NullPointerException
at io.fabric8.jenkins.openshiftsync.JenkinsUtils.deleteRun(JenkinsUtils.java:566)
at io.fabric8.jenkins.openshiftsync.JenkinsUtils.deleteRun(JenkinsUtils.java:575)
at io.fabric8.jenkins.openshiftsync.BuildWatcher.innerDeleteEventToJenkinsJobRun(BuildWatcher.java:424)
at io.fabric8.jenkins.openshiftsync.BuildWatcher.deleteEventToJenkinsJobRun(BuildWatcher.java:453)
at io.fabric8.jenkins.openshiftsync.BuildWatcher.eventReceived(BuildWatcher.java:171)
at io.fabric8.jenkins.openshiftsync.BuildWatcher.eventReceived(BuildWatcher.java:187)
at io.fabric8.jenkins.openshiftsync.WatcherCallback.eventReceived(WatcherCallback.java:34)
at io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager$2.onMessage(WatchConnectionManager.java:237)
at okhttp3.internal.ws.RealWebSocket.onReadMessage(RealWebSocket.java:310)
at okhttp3.internal.ws.WebSocketReader.readMessageFrame(WebSocketReader.java:222)
at okhttp3.internal.ws.WebSocketReader.processNextFrame(WebSocketReader.java:101)
at okhttp3.internal.ws.RealWebSocket.loopReader(RealWebSocket.java:265)
at okhttp3.internal.ws.RealWebSocket$2.onResponse(RealWebSocket.java:204)
at okhttp3.RealCall$AsyncCall.execute(RealCall.java:153)
at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

{code}
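
For anyone checking whether their own master log shows the same pattern, a small sketch like the following could pull the refused agent names out of the log. The class name `RefusedAgentScan` and the helper `refusedAgents` are hypothetical; the only thing taken from the log above is the message text "Unknown client name: <agent>".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: finds agent names that the master refused as unknown clients. */
class RefusedAgentScan {
    // Matches the refusal message seen in the log and captures the agent name after it.
    private static final Pattern REFUSED = Pattern.compile("Unknown client name: (\\S+)");

    /** Returns the agent name from every line that contains a refusal message. */
    static List<String> refusedAgents(List<String> logLines) {
        List<String> names = new ArrayList<>();
        for (String line : logLines) {
            Matcher m = REFUSED.matcher(line);
            if (m.find()) {
                names.add(m.group(1));
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Two lines copied from the log above.
        List<String> sample = Arrays.asList(
            "INFO: Accepted JNLP4-connect connection #13 from /10.38.49.11:57212",
            "INFO: [JNLP4-connect connection from 10.38.49.11/10.38.49.11:57212] Refusing headers from remote: Unknown client name: java11-maven35-nodejs8-boosted-m3tvb");
        System.out.println(refusedAgents(sample)); // prints [java11-maven35-nodejs8-boosted-m3tvb]
    }
}
```

Comparing the extracted names with the "Terminating Kubernetes instance for agent" lines shows whether the refused agent is one the master had already removed.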

tamerlaha@gmail.com (JIRA)

Sep 19, 2019, 3:50:02 PM
to jenkinsc...@googlegroups.com
ipleten commented on Bug JENKINS-54683

We have exactly the same problem.

This message was sent by Atlassian Jira (v7.13.6#713006-sha1:cc4451f)