Jenkins Agent/Slave on Windows Disconnect Issue


Vinod Krishna

May 29, 2020, 5:52:47 PM
to Jenkins Users

Hi, 


We have around 10 Jenkins agents, each running on its own Windows 2016 EC2 instance. Java_slave is running as a service. The Jenkins master runs on a separate Amazon Linux instance. We are able to establish connectivity between the master and the agents, and jobs run fine.

However, for some reason, the service goes offline at varying intervals and then comes back online. This happens repeatedly, and we cannot find much in the Windows Event Viewer except the message "Jenkins Slave stopping", after which the service comes back online. We installed the New Relic APM agent on the server to check the Java metrics, and heap consumption is minimal. The Java version is the same on both the agent and the server (jdk1.8.0_211). We are not able to find the root cause of the service stopping abruptly, and jobs running on the agents get killed.


"windows agent was marked offline: Connection was broken: java.nio.channels.ClosedChannelException"


Thanks in advance. 

Vinod

D'raj

Jun 3, 2020, 1:13:37 AM
to Jenkins Users
Try increasing the AWS ELB idle timeout; by default it's 60 seconds.
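
For anyone wanting to try this from a script rather than the AWS console, here is a minimal sketch using boto3. The thread does not say which kind of load balancer is in use, so the sketch covers both a Classic Load Balancer and an Application Load Balancer; the name `jenkins-elb`, the 300-second value, and the helper function names are assumptions for illustration only. The 1 to 4000 second range check reflects the limit AWS documents for this setting and should be verified for your account.

```python
"""Sketch: raise the idle timeout of an AWS load balancer with boto3."""


def idle_timeout_request(kind, identifier, seconds):
    """Build keyword arguments for modify_load_balancer_attributes.

    kind       -- "classic" (boto3 "elb" client) or "application" ("elbv2")
    identifier -- load balancer name (classic) or ARN (application)
    seconds    -- new idle timeout in seconds
    """
    # Assumed limit: AWS documents 1..4000 seconds for this setting.
    if not 1 <= seconds <= 4000:
        raise ValueError("idle timeout must be between 1 and 4000 seconds")
    if kind == "classic":
        return {
            "LoadBalancerName": identifier,
            "LoadBalancerAttributes": {
                "ConnectionSettings": {"IdleTimeout": seconds}
            },
        }
    if kind == "application":
        return {
            "LoadBalancerArn": identifier,
            "Attributes": [
                {"Key": "idle_timeout.timeout_seconds", "Value": str(seconds)}
            ],
        }
    raise ValueError("kind must be 'classic' or 'application'")


def apply_idle_timeout(kind, identifier, seconds):
    """Send the request to AWS (needs boto3 and valid credentials)."""
    request = idle_timeout_request(kind, identifier, seconds)
    import boto3  # third-party; imported lazily so the builder works without it

    client = boto3.client("elb" if kind == "classic" else "elbv2")
    return client.modify_load_balancer_attributes(**request)


if __name__ == "__main__":
    # Hypothetical load balancer name; only prints the request, sends nothing.
    print(idle_timeout_request("classic", "jenkins-elb", 300))
```

Calling `apply_idle_timeout("classic", "jenkins-elb", 300)` would send the change; the same setting is also available in the load balancer's attributes in the AWS console.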

monger_39

Jun 4, 2020, 7:07:11 AM
to jenkins...@googlegroups.com
Have you looked at the remoting logs on the agent?
I've had (and still have) the same issue. Often I see an error like this in the remoting logs on the node:
   "Reader thread killed by OutOfMemoryError
  java.lang.OutOfMemoryError: unable to create new native thread
  "
which, by the way, does not necessarily mean "out of memory". It can apparently also indicate that the JVM was unable to create a new thread.
The exact reason(s) for the latter are still not 100% clear to me.
I'm very curious/anxious to have more info here too...
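
To tell these cases apart when scanning remoting logs, a small sketch like the following may help. It maps the detail text after `java.lang.OutOfMemoryError:` to a rough category. The message strings are the ones HotSpot JVMs emit; the category labels and the function name `classify_oom` are made up for this example. The "unable to create new native thread" variant means the JVM asked the operating system for a new thread and was refused, which usually points at a thread or process limit or at a shortage of native memory for thread stacks rather than at the Java heap.

```python
"""Sketch: classify java.lang.OutOfMemoryError lines found in a log."""

MARKER = "java.lang.OutOfMemoryError:"

# Detail message -> rough category (labels are illustrative, not official).
OOM_CATEGORIES = {
    "unable to create new native thread": "native-thread-limit",
    "Java heap space": "heap-exhausted",
    "GC overhead limit exceeded": "heap-exhausted",
    "Metaspace": "metaspace-exhausted",
}


def classify_oom(line):
    """Return a category for an OutOfMemoryError line, or None if absent."""
    if MARKER not in line:
        return None
    detail = line.split(MARKER, 1)[1].strip()
    return OOM_CATEGORIES.get(detail, "unknown")


if __name__ == "__main__":
    sample = "  java.lang.OutOfMemoryError: unable to create new native thread"
    print(classify_oom(sample))
```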

--
You received this message because you are subscribed to the Google Groups "Jenkins Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to jenkinsci-use...@googlegroups.com.
To view this discussion on the web visit
https://groups.google.com/d/msgid/jenkinsci-users/f7c35898-4a54-4e1c-b199-97b5bd43db77%40googlegroups.com
.

Vinod Krishna

Jun 4, 2020, 10:10:00 AM
to Jenkins Users
Thanks for the response!

I did check the remoting logs; all I see is below

Jun 04, 2020 1:57:27 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Handshaking
Jun 04, 2020 1:57:27 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Connecting to <myci.example.com>:50000
Jun 04, 2020 1:57:27 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Trying protocol: JNLP4-connect
Jun 04, 2020 1:57:27 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Remote identity confirmed: 65:f3:2a:9c:fc:ec:55:9f:49:de:49:a0:bf:27:ff:93
Jun 04, 2020 1:57:28 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Connected
Jun 04, 2020 1:59:15 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Terminated


There are no logs that say what is triggering the termination of service. However, it comes back online after some time. 
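
Since the log itself gives no reason, one thing that can still be extracted from it is how long each session lasted. The sketch below parses a remoting log in the format shown above and reports the seconds between each `Connected` and the following `Terminated`. Comparing those gaps with any idle timeout on the network path is only a hint, not proof of the cause. The function name `session_lengths` is an assumption of this example, and the timestamp parsing assumes an English locale.

```python
"""Sketch: measure how long each agent session lasted in a remoting log."""
import re
from datetime import datetime

# Matches the timestamp at the start of a header line such as
# "Jun 04, 2020 1:57:28 PM hudson.remoting.jnlp.Main$CuiListener status".
STAMP = re.compile(r"^(\w{3} \d{2}, \d{4} \d{1,2}:\d{2}:\d{2} [AP]M) ")
STAMP_FORMAT = "%b %d, %Y %I:%M:%S %p"


def session_lengths(log_text):
    """Return seconds between each 'Connected' and the next 'Terminated'."""
    lengths = []
    stamp = None         # timestamp of the most recent header line
    connected_at = None  # timestamp of the last 'Connected' not yet closed
    for raw in log_text.splitlines():
        line = raw.strip()
        match = STAMP.match(line)
        if match:
            stamp = datetime.strptime(match.group(1), STAMP_FORMAT)
        elif line == "INFO: Connected":
            connected_at = stamp
        elif line == "INFO: Terminated":
            if connected_at is not None and stamp is not None:
                lengths.append((stamp - connected_at).total_seconds())
            connected_at = None
    return lengths


SAMPLE = """\
Jun 04, 2020 1:57:27 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Trying protocol: JNLP4-connect
Jun 04, 2020 1:57:28 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Connected
Jun 04, 2020 1:59:15 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Terminated
"""

if __name__ == "__main__":
    print(session_lengths(SAMPLE))  # [107.0]
```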

Vinod Krishna

Jun 4, 2020, 12:59:26 PM
to Jenkins Users
I wonder if that is going to help. The ELB idle timeout only applies to the connections between (1) the client and the ELB and (2) the ELB and the backend instance. In this case, only the Jenkins master is behind the ALB, and that connection is fine. The Windows agents mentioned here are not part of the ELB setup, but can be considered clients connecting to the ELB. I can try increasing the timeout, but I'm not sure that is going to help.

Vinod Krishna

Jun 8, 2020, 5:55:14 PM
to Jenkins Users
Hi,

It looks like increasing the ELB Timeout helped us! Thanks a lot!


On Wednesday, 3 June 2020 01:13:37 UTC-4, D'raj wrote:

Slide

Jun 8, 2020, 6:05:11 PM
to Jenkins User Mailing List
How did you modify this setting? In the Jenkins cloud configuration, or on AWS itself?


