[JIRA] (JENKINS-59910) Nodes can't connect to master after Uncaught exception in TcpSlaveAgentListener ConnectionHandler Thread


oxygenxo@gmail.com (JIRA)

Oct 23, 2019, 5:34:03 PM
to jenkinsc...@googlegroups.com
Andrey Babushkin created an issue
 
Jenkins / Bug JENKINS-59910
Nodes can't connect to master after Uncaught exception in TcpSlaveAgentListener ConnectionHandler Thread
Issue Type: Bug
Assignee: Unassigned
Components: core, kubernetes-plugin
Created: 2019-10-23 21:33
Environment: Official Docker image jenkins/jenkins:2.190.1-jdk11
No HTTPS enabled
Ubuntu 18.04
Priority: Critical
Reporter: Andrey Babushkin

While investigating a spike in the build queue size, we found that the TcpSlaveAgentListener thread was dead, with the following logs:

2019-10-23 09:02:17.236+0000 [id=200815]        SEVERE  h.TcpSlaveAgentListener$ConnectionHandler#lambda$new$0: Uncaught exception in TcpSlaveAgentListener ConnectionHandler Thread[TCP agent connection handler #1715 with /10.125.100.99:47700,5,main]
java.lang.UnsupportedOperationException: Network layer is not supposed to call isSendOpen
        at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.isSendOpen(ProtocolStack.java:730)
        at org.jenkinsci.remoting.protocol.FilterLayer.isSendOpen(FilterLayer.java:340)
        at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.isSendOpen(ProtocolStack.java:738)
        at org.jenkinsci.remoting.protocol.FilterLayer.isSendOpen(FilterLayer.java:340)
        at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.isSendOpen(SSLEngineFilterLayer.java:237)
        at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.isSendOpen(ProtocolStack.java:738)
        at org.jenkinsci.remoting.protocol.FilterLayer.isSendOpen(FilterLayer.java:340)
        at org.jenkinsci.remoting.protocol.impl.ConnectionHeadersFilterLayer.isSendOpen(ConnectionHeadersFilterLayer.java:514)
        at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.doSend(ProtocolStack.java:690)
        at org.jenkinsci.remoting.protocol.ApplicationLayer.write(ApplicationLayer.java:157)
        at org.jenkinsci.remoting.protocol.impl.ChannelApplicationLayer.start(ChannelApplicationLayer.java:230)
        at org.jenkinsci.remoting.protocol.ProtocolStack.init(ProtocolStack.java:201)
        at org.jenkinsci.remoting.protocol.ProtocolStack.access$700(ProtocolStack.java:106)
        at org.jenkinsci.remoting.protocol.ProtocolStack$Builder.build(ProtocolStack.java:554)
        at org.jenkinsci.remoting.engine.JnlpProtocol4Handler.handle(JnlpProtocol4Handler.java:153)
        at jenkins.slaves.JnlpSlaveAgentProtocol4.handle(JnlpSlaveAgentProtocol4.java:203)
        at hudson.TcpSlaveAgentListener$ConnectionHandler.run(TcpSlaveAgentListener.java:271)
2019-10-23 09:02:17.237+0000 [id=200815]        WARNING hudson.TcpSlaveAgentListener$1#run: Connection handler failed, restarting listener
java.lang.UnsupportedOperationException: Network layer is not supposed to call isSendOpen
        at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.isSendOpen(ProtocolStack.java:730)
        at org.jenkinsci.remoting.protocol.FilterLayer.isSendOpen(FilterLayer.java:340)
        at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.isSendOpen(ProtocolStack.java:738)
        at org.jenkinsci.remoting.protocol.FilterLayer.isSendOpen(FilterLayer.java:340)
        at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.isSendOpen(SSLEngineFilterLayer.java:237)
        at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.isSendOpen(ProtocolStack.java:738)
        at org.jenkinsci.remoting.protocol.FilterLayer.isSendOpen(FilterLayer.java:340)
        at org.jenkinsci.remoting.protocol.impl.ConnectionHeadersFilterLayer.isSendOpen(ConnectionHeadersFilterLayer.java:514)
        at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.doSend(ProtocolStack.java:690)
        at org.jenkinsci.remoting.protocol.ApplicationLayer.write(ApplicationLayer.java:157)
        at org.jenkinsci.remoting.protocol.impl.ChannelApplicationLayer.start(ChannelApplicationLayer.java:230)
        at org.jenkinsci.remoting.protocol.ProtocolStack.init(ProtocolStack.java:201)
        at org.jenkinsci.remoting.protocol.ProtocolStack.access$700(ProtocolStack.java:106)
        at org.jenkinsci.remoting.protocol.ProtocolStack$Builder.build(ProtocolStack.java:554)
        at org.jenkinsci.remoting.engine.JnlpProtocol4Handler.handle(JnlpProtocol4Handler.java:153)
        at jenkins.slaves.JnlpSlaveAgentProtocol4.handle(JnlpSlaveAgentProtocol4.java:203)
        at hudson.TcpSlaveAgentListener$ConnectionHandler.run(TcpSlaveAgentListener.java:271) 

This was followed by these logs from nodes created by the Jenkins Kubernetes Plugin:

SEVERE: http://jenkins-master.example.com/ provided port:50000 is not reachable
java.io.IOException: http://jenkins-master.example.com/ provided port:50000 is not reachable
        at org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver.resolve(JnlpAgentEndpointResolver.java:287)
        at hudson.remoting.Engine.innerRun(Engine.java:523)
        at hudson.remoting.Engine.run(Engine.java:474)
 

Changing the JNLP port from 50000 to 50001 and back in the Jenkins settings restored the connection, and the nodes were then able to connect to the master again.
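
To tell a dead listener apart from a network problem, a plain TCP connect to the agent port from the agent's side can help. The following is a minimal sketch and not part of Jenkins remoting: the class name `AgentPortProbe` and the 3-second timeout are assumptions for illustration, and the default host and port are just the values quoted in the logs above.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not part of Jenkins: probes a TCP port with a plain connect.
final class AgentPortProbe {

    /** Returns true if a TCP connection to host:port succeeds within timeoutMillis. */
    static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Connection refused, timeout, or unresolvable host all count as "not reachable".
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults are the host and port from the agent log in this issue.
        String host = args.length > 0 ? args[0] : "jenkins-master.example.com";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 50000;
        boolean ok = isReachable(host, port, 3000);
        System.out.println(host + ":" + port + (ok ? " is reachable" : " is NOT reachable"));
    }
}
```

If the probe fails while the HTTP port of the same host still answers, that points at the listener rather than the network.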

A few questions:

  1. How can I debug this further?
  2. Can it be an issue with Jenkins 2.190.1? (We've faced this twice since upgrading from the previous LTS in September.)
  3. Is there some way to notify administrator about such things in logs?
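
On the third question, one option outside Jenkins itself is to watch the master's log for the SEVERE message quoted above and alert on it. The following is a minimal sketch, assuming the log is available as a UTF-8 text file; the class name `ListenerCrashScanner`, the `ALERT:` prefix and the exit code are hypothetical, and only the marker string comes from the logs in this issue.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Jenkins: finds listener-crash lines in a log.
final class ListenerCrashScanner {

    // Message text taken from the SEVERE log line quoted in this issue.
    static final String MARKER = "Uncaught exception in TcpSlaveAgentListener ConnectionHandler";

    /** Returns every log line that contains the crash marker, in order. */
    static List<String> findCrashes(List<String> logLines) {
        List<String> hits = new ArrayList<>();
        for (String line : logLines) {
            if (line.contains(MARKER)) {
                hits.add(line);
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java ListenerCrashScanner.java <path-to-log>");
            return;
        }
        List<String> hits = findCrashes(Files.readAllLines(Paths.get(args[0])));
        for (String hit : hits) {
            System.out.println("ALERT: " + hit);
        }
        // Non-zero exit so a cron job or monitoring wrapper can react.
        if (!hits.isEmpty()) {
            System.exit(1);
        }
    }
}
```

Run periodically against the log file, this gives an administrator a signal before the build queue grows.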
 
This message was sent by Atlassian Jira (v7.13.6#713006-sha1:cc4451f)

oxygenxo@gmail.com (JIRA)

Oct 29, 2019, 7:44:02 AM
to jenkinsc...@googlegroups.com

oxygenxo@gmail.com (JIRA)

Apr 23, 2020, 6:51:03 PM
to jenkinsc...@googlegroups.com
Andrey Babushkin updated an issue
 
Change By: Andrey Babushkin
Environment:
Official Docker image jenkins/jenkins:2.190.1-jdk11
Both with and without Nginx 1.17.6 as reverse proxy (replaces "No HTTPS enabled")
Ubuntu 18.04

oxygenxo@gmail.com (JIRA)

Apr 23, 2020, 6:51:04 PM
to jenkinsc...@googlegroups.com
Andrey Babushkin updated an issue
Change By: Andrey Babushkin
Environment: Official Docker image based on jenkins/jenkins:2.204.5-jdk11 (previously 2.190.1-jdk11)

Both with and without Nginx 1.17.6 as reverse proxy
Ubuntu 18.04

oxygenxo@gmail.com (JIRA)

Apr 23, 2020, 6:51:05 PM
to jenkinsc...@googlegroups.com
 
Re: Nodes can't connect to master after Uncaught exception in TcpSlaveAgentListener ConnectionHandler Thread

Nope, it DID occur and still occurs sometimes; we are on Jenkins 2.204.5 now.

Jesse Glick, could you please give me some hints on how to debug this further to find the root cause?

oxygenxo@gmail.com (JIRA)

Apr 23, 2020, 6:52:05 PM
to jenkinsc...@googlegroups.com
Andrey Babushkin updated an issue
Change By: Andrey Babushkin
While investigating a spike in the build queue size, we found that the TcpSlaveAgentListener thread was dead, with the following logs:
{code:java}
2019-10-23 09:02:17.236+0000 [id=200815]        SEVERE  h.TcpSlaveAgentListener$ConnectionHandler#lambda$new$0: Uncaught exception in TcpSlaveAgentListener ConnectionHandler Thread[TCP agent connection handler #1715 with /10.125.100.99:47700,5,main]
        at hudson.TcpSlaveAgentListener$ConnectionHandler.run(TcpSlaveAgentListener.java:271) {code}

This was followed by these logs from nodes created by the Jenkins Kubernetes Plugin:
{code:java}

SEVERE: http://jenkins-master.example.com/ provided port:50000 is not reachable
java.io.IOException: http://jenkins-master.example.com/ provided port:50000 is not reachable
        at org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver.resolve(JnlpAgentEndpointResolver.java:287)
        at hudson.remoting.Engine.innerRun(Engine.java:523)
        at hudson.remoting.Engine.run(Engine.java:474)
{code}

Changing the JNLP port from 50000 to 50001 and back in the Jenkins settings restored the connection, and the nodes were then able to connect to the master again.


A few questions:
# How can I debug this further?
# Can it be an issue with Jenkins 2.190.1? (We've faced this twice since upgrading from the previous LTS in September.)
# Is there some way to notify administrator about such things in logs?

jglick@cloudbees.com (JIRA)

Apr 24, 2020, 2:31:04 PM
to jenkinsc...@googlegroups.com
Jesse Glick commented on Bug JENKINS-59910
 
Re: Nodes can't connect to master after Uncaught exception in TcpSlaveAgentListener ConnectionHandler Thread

Sorry, I am not familiar with ProtocolStack really. You can try WebSocket mode to see if it behaves any better.

oxygenxo@gmail.com (JIRA)

Apr 24, 2020, 6:32:04 PM
to jenkinsc...@googlegroups.com

Thanks! I'll give it a shot.

For future readers of this ticket: Jesse was referring to the WebSocket mode introduced in 2.217: https://www.jenkins.io/blog/2020/02/02/web-socket/
