[JIRA] (JENKINS-59910) Nodes can't connect to master after Uncaught exception in TcpSlaveAgentListener ConnectionHandler Thread

oxygenxo@gmail.com (JIRA)

Oct 23, 2019, 5:34:03 PM

to jenkinsc...@googlegroups.com

Andrey Babushkin created an issue

Issue Type:	Bug
Assignee:	Unassigned
Components:	core, kubernetes-plugin
Created:	2019-10-23 21:33
Environment:	Official Docker image jenkins/jenkins:2.190.1-jdk11; no HTTPS enabled; Ubuntu 18.04
Priority:	Critical
Reporter:	Andrey Babushkin

While investigating a spike in the build queue size, we found that the TcpSlaveAgentListener thread was dead, with the following logs:

 
2019-10-23 09:02:17.236+0000 [id=200815] SEVERE h.TcpSlaveAgentListener$ConnectionHandler#lambda$new$0: Uncaught exception in TcpSlaveAgentListener ConnectionHandler Thread[TCP agent connection handler #1715 with /10.125.100.99:47700,5,main]
java.lang.UnsupportedOperationException: Network layer is not supposed to call isSendOpen
        at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.isSendOpen(ProtocolStack.java:730)
        at org.jenkinsci.remoting.protocol.FilterLayer.isSendOpen(FilterLayer.java:340)
        at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.isSendOpen(ProtocolStack.java:738)
        at org.jenkinsci.remoting.protocol.FilterLayer.isSendOpen(FilterLayer.java:340)
        at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.isSendOpen(SSLEngineFilterLayer.java:237)
        at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.isSendOpen(ProtocolStack.java:738)
        at org.jenkinsci.remoting.protocol.FilterLayer.isSendOpen(FilterLayer.java:340)
        at org.jenkinsci.remoting.protocol.impl.ConnectionHeadersFilterLayer.isSendOpen(ConnectionHeadersFilterLayer.java:514)
        at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.doSend(ProtocolStack.java:690)
        at org.jenkinsci.remoting.protocol.ApplicationLayer.write(ApplicationLayer.java:157)
        at org.jenkinsci.remoting.protocol.impl.ChannelApplicationLayer.start(ChannelApplicationLayer.java:230)
        at org.jenkinsci.remoting.protocol.ProtocolStack.init(ProtocolStack.java:201)
        at org.jenkinsci.remoting.protocol.ProtocolStack.access$700(ProtocolStack.java:106)
        at org.jenkinsci.remoting.protocol.ProtocolStack$Builder.build(ProtocolStack.java:554)
        at org.jenkinsci.remoting.engine.JnlpProtocol4Handler.handle(JnlpProtocol4Handler.java:153)
        at jenkins.slaves.JnlpSlaveAgentProtocol4.handle(JnlpSlaveAgentProtocol4.java:203)
        at hudson.TcpSlaveAgentListener$ConnectionHandler.run(TcpSlaveAgentListener.java:271)
2019-10-23 09:02:17.237+0000 [id=200815] WARNING hudson.TcpSlaveAgentListener$1#run: Connection handler failed, restarting listener
java.lang.UnsupportedOperationException: Network layer is not supposed to call isSendOpen
        at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.isSendOpen(ProtocolStack.java:730)
        at org.jenkinsci.remoting.protocol.FilterLayer.isSendOpen(FilterLayer.java:340)
        at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.isSendOpen(ProtocolStack.java:738)
        at org.jenkinsci.remoting.protocol.FilterLayer.isSendOpen(FilterLayer.java:340)
        at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.isSendOpen(SSLEngineFilterLayer.java:237)
        at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.isSendOpen(ProtocolStack.java:738)
        at org.jenkinsci.remoting.protocol.FilterLayer.isSendOpen(FilterLayer.java:340)
        at org.jenkinsci.remoting.protocol.impl.ConnectionHeadersFilterLayer.isSendOpen(ConnectionHeadersFilterLayer.java:514)
        at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.doSend(ProtocolStack.java:690)
        at org.jenkinsci.remoting.protocol.ApplicationLayer.write(ApplicationLayer.java:157)
        at org.jenkinsci.remoting.protocol.impl.ChannelApplicationLayer.start(ChannelApplicationLayer.java:230)
        at org.jenkinsci.remoting.protocol.ProtocolStack.init(ProtocolStack.java:201)
        at org.jenkinsci.remoting.protocol.ProtocolStack.access$700(ProtocolStack.java:106)
        at org.jenkinsci.remoting.protocol.ProtocolStack$Builder.build(ProtocolStack.java:554)
        at org.jenkinsci.remoting.engine.JnlpProtocol4Handler.handle(JnlpProtocol4Handler.java:153)
        at jenkins.slaves.JnlpSlaveAgentProtocol4.handle(JnlpSlaveAgentProtocol4.java:203)
        at hudson.TcpSlaveAgentListener$ConnectionHandler.run(TcpSlaveAgentListener.java:271)

Followed by logs from nodes created by Jenkins Kubernetes Plugin:

 
SEVERE: http://jenkins-master.example.com/ provided port:50000 is not reachable
java.io.IOException: http://jenkins-master.example.com/ provided port:50000 is not reachable
        at org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver.resolve(JnlpAgentEndpointResolver.java:287)
        at hudson.remoting.Engine.innerRun(Engine.java:523)
        at hudson.remoting.Engine.run(Engine.java:474)

Changing the JNLP port from 50000 to 50001 and back in the Jenkins settings restored the listener, and nodes were then able to connect to the master again.
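
To tell whether the master is in this state without waiting for agents to fail, the agent port could be probed from outside. The following is only a minimal sketch: it assumes that "not reachable" in the agent log corresponds to a plain TCP connect failing, and `AgentPortProbe` is a hypothetical helper, not part of Jenkins or remoting. The host name and port are the ones from the logs above.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper (not a Jenkins class): checks whether a TCP port accepts connections.
final class AgentPortProbe {

    /** Returns true if a TCP connection to host:port succeeds within timeoutMillis. */
    static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers connection refused, timeouts and unresolvable host names.
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults are taken from the logs in this report; override them on the command line.
        String host = args.length > 0 ? args[0] : "jenkins-master.example.com";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 50000;
        boolean ok = isReachable(host, port, 3000);
        System.out.println(host + ":" + port + (ok ? " is reachable" : " is NOT reachable"));
        // A non-zero exit code lets a cron job or monitoring check raise an alert.
        System.exit(ok ? 0 : 1);
    }
}
```

Run periodically, a check like this would only show that the port stopped accepting connections; it says nothing about why the listener died.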

A few questions:

1. How can I debug this further?
2. Could this be an issue with Jenkins 2.190.1? (We have hit this twice since upgrading from the previous LTS in September.)
3. Is there a way to notify an administrator when such errors appear in the logs?
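
As a stopgap for the last question, the master log could be scanned for the two messages quoted above. This is a rough sketch under assumptions: `ListenerFailureScan` is a hypothetical class, the log is assumed to be available as a plain text file whose path is passed as the only argument, and the marker strings are copied from the log records in this report.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper (not a Jenkins class): finds listener failure records in a log file.
final class ListenerFailureScan {

    // Marker strings copied from the SEVERE and WARNING records quoted in this report.
    static final List<String> MARKERS = List.of(
            "Uncaught exception in TcpSlaveAgentListener ConnectionHandler",
            "Connection handler failed, restarting listener");

    /** Returns every log line that contains at least one of the markers. */
    static List<String> findFailures(List<String> logLines) {
        return logLines.stream()
                .filter(line -> MARKERS.stream().anyMatch(line::contains))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java ListenerFailureScan <path-to-jenkins-log>");
            System.exit(2);
        }
        List<String> hits = findFailures(Files.readAllLines(Path.of(args[0])));
        hits.forEach(System.out::println);
        // A non-zero exit code when matches exist lets a wrapper script send the notification.
        System.exit(hits.isEmpty() ? 0 : 1);
    }
}
```

How the notification itself is delivered (mail, chat, monitoring system) is left to whatever wraps this check.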

This message was sent by Atlassian Jira (v7.13.6#713006-sha1:cc4451f)

oxygenxo@gmail.com (JIRA)

Oct 29, 2019, 7:44:02 AM

to jenkinsc...@googlegroups.com

Andrey Babushkin commented on

JENKINS-59910

Re: Nodes can't connect to master after Uncaught exception in TcpSlaveAgentListener ConnectionHandler Thread

It has not occurred since we rolled back to Jenkins 2.176.4.

oxygenxo@gmail.com (JIRA)

Apr 23, 2020, 6:51:03 PM

to jenkinsc...@googlegroups.com

Andrey Babushkin updated an issue

Change By:	Andrey Babushkin
Environment:

Official Docker image jenkins/jenkins:2.190.1-jdk11

No HTTPS enabled Both with and without Nginx 1.17.6 as reverse proxy
Ubuntu 18.04

oxygenxo@gmail.com (JIRA)

Apr 23, 2020, 6:51:04 PM

to jenkinsc...@googlegroups.com

Andrey Babushkin updated an issue

Change By:	Andrey Babushkin
Environment:	Official Docker image based on jenkins/jenkins:2.204.5-jdk11 (previously 2.190.1-jdk11)

Both with and without Nginx 1.17.6 as reverse proxy
Ubuntu 18.04

oxygenxo@gmail.com (JIRA)

Apr 23, 2020, 6:51:05 PM

to jenkinsc...@googlegroups.com

Andrey Babushkin commented on

JENKINS-59910

Re: Nodes can't connect to master after Uncaught exception in TcpSlaveAgentListener ConnectionHandler Thread

Correction: it DID occur and still occurs sometimes. We are on Jenkins 2.204.5 now.

Jesse Glick, could you please give me some hints on how to debug this further and find the root cause?

oxygenxo@gmail.com (JIRA)

Apr 23, 2020, 6:52:05 PM

to jenkinsc...@googlegroups.com

Andrey Babushkin updated an issue

Change By:	Andrey Babushkin

While investigating a spike in the build queue size, we found that the TcpSlaveAgentListener thread was dead, with the following logs:

{code:java}
2019-10-23 09:02:17.236+0000 [id=200815] SEVERE h.TcpSlaveAgentListener$ConnectionHandler#lambda$new$0: Uncaught exception in TcpSlaveAgentListener ConnectionHandler Thread[TCP agent connection handler #1715 with /10.125.100.99:47700,5,main]

at hudson.TcpSlaveAgentListener$ConnectionHandler.run(TcpSlaveAgentListener.java:271) {code}

Followed by logs from nodes created by Jenkins Kubernetes Plugin:

{code:java}

SEVERE: http://jenkins-master.example.com/ provided port:50000 is not reachable
java.io.IOException: http://jenkins-master.example.com/ provided port:50000 is not reachable
        at org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver.resolve(JnlpAgentEndpointResolver.java:287)
        at hudson.remoting.Engine.innerRun(Engine.java:523)
        at hudson.remoting.Engine.run(Engine.java:474)

{code}

Changing the JNLP port from 50000 to 50001 and back in the Jenkins settings restored the listener, and nodes were then able to connect to the master again.

A few questions:
# How can I debug this further?
# Could this be an issue with Jenkins 2.190.1? (We have hit this twice since upgrading from the previous LTS in September.)
# Is there a way to notify an administrator when such errors appear in the logs?