[JIRA] (JENKINS-59579) EC2 Plugin stops slave when build is running

cimpoiesgeorge@gmail.com (JIRA)

Oct 1, 2019, 5:07:02 AM
to jenkinsc...@googlegroups.com
George Cimpoies updated an issue
 
Jenkins / Bug JENKINS-59579
EC2 Plugin stops slave when build is running
Change By: George Cimpoies
Summary: EC2 Plugin stops slave when build is running (was: Amazon EC2 Plugin idle timeout, where to find it and how to increase it?)
Issue Type: Bug (was: New Feature)
Priority: Blocker (was: Major)
I have set up the connection between Jenkins and AWS via the Amazon EC2 plugin. Jenkins master cloud config: !Capture.PNG! !Capture2.PNG! !Capture3.PNG|width=1016,height=736!

 

The node connects via the Amazon EC2 plugin and then creates a new connection via the Swarm plugin, and the job ends up running on the connection made through Swarm. (This is because my jobs include TestComplete and FlaUI, and WinRM is not well suited to their requirements.)

 

Jobs that take under 25 min run successfully; anything that goes over 25-26 min fails with the following:

 
Slave log:
{code:java}
12:49:46 java.io.IOException: Backing channel 'JNLP4-connect connection from 10.230.0.101/10.230.0.101:49724' is disconnected.
12:49:46  at hudson.remoting.RemoteInvocationHandler.channelOrFail(RemoteInvocationHandler.java:214)
12:49:46  at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:283)
12:49:46  at com.sun.proxy.$Proxy89.isAlive(Unknown Source)
12:49:46  at hudson.Launcher$RemoteLauncher$ProcImpl.isAlive(Launcher.java:1172)
12:49:46  at hudson.Launcher$RemoteLauncher$ProcImpl.join(Launcher.java:1164)
12:49:46  at hudson.Launcher$ProcStarter.join(Launcher.java:492)
12:49:46  at hudson.plugins.gradle.Gradle.performTask(Gradle.java:333)
12:49:46  at hudson.plugins.gradle.Gradle.perform(Gradle.java:225)
12:49:46  at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
12:49:46  at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:741)
12:49:46  at hudson.model.Build$BuildExecution.build(Build.java:206)
12:49:46  at hudson.model.Build$BuildExecution.doRun(Build.java:163)
12:49:46  at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:504)
12:49:46  at hudson.model.Run.execute(Run.java:1815)
12:49:46  at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
12:49:46  at hudson.model.ResourceController.execute(ResourceController.java:97)
12:49:46  at hudson.model.Executor.run(Executor.java:429)
12:49:46 Caused by: java.nio.channels.ClosedChannelException
12:49:46  at org.jenkinsci.remoting.protocol.NetworkLayer.onRecvClosed(NetworkLayer.java:154)
12:49:46  at org.jenkinsci.remoting.protocol.impl.NIONetworkLayer.ready(NIONetworkLayer.java:179)
12:49:46  at org.jenkinsci.remoting.protocol.IOHub$OnReady.run(IOHub.java:795)
12:49:46  at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
12:49:46  at jenkins.security.ImpersonatingExecutorService$1.run(ImpersonatingExecutorService.java:59)
12:49:46  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
12:49:46  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
12:49:46  at java.lang.Thread.run(Thread.java:748)
{code}
On the master's log I can see:
{code:java}
Idle timeout of EC2 (Itiviti AWS) - Windows Jenkins node autoconnecting to deb-jenkins-prd using Swarm plugin (i-000908b57bb5d82a7) after 30 idle minutes, instance statusRUNNING
Sep 30, 2019 8:40:45 AM INFO hudson.plugins.ec2.EC2AbstractSlave idleTimeout
EC2 instance idle time expired: i-000908b57bb5d82a7
Sep 30, 2019 8:40:46 AM INFO hudson.plugins.ec2.EC2OndemandSlave terminate
Terminated EC2 instance (terminated): i-000908b57bb5d82a7
Sep 30, 2019 8:40:46 AM INFO jenkins.slaves.DefaultJnlpSlaveReceiver channelClosed
IOHub#1: Worker[channel:java.nio.channels.SocketChannel[connected local=/172.17.0.2:40440 remote=10.230.0.71/10.230.0.71:49735]] / Computer.threadPoolForRemoting [#85772] for ec2amaz-glc1084 terminated: java.nio.channels.ClosedChannelException
Sep 30, 2019 8:40:46 AM INFO hudson.model.Run execute
aws-ul-trader-extension-master-desk-uitests-listorders #22 main build action completed: FAILURE
Sep 30, 2019 8:40:46 AM INFO hudson.plugins.ec2.EC2OndemandSlave terminate
Removed EC2 instance from jenkins master: i-000908b57bb5d82a7
{code}
After that period of time the slave is disconnected even though the build was running on it, resulting in:

 
15:40:46 FATAL: command execution failed
15:40:46 java.io.IOException: Backing channel 'JNLP4-connect connection from 10.230.0.176/10.230.0.176:49733' is disconnected.
15:40:46  at hudson.remoting.RemoteInvocationHandler.channelOrFail(RemoteInvocationHandler.java:214)
15:40:46  at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:283)
15:40:46  at com.sun.proxy.$Proxy89.isAlive(Unknown Source)
15:40:46  at hudson.Launcher$RemoteLauncher$ProcImpl.isAlive(Launcher.java:1172)
15:40:46  at hudson.Launcher$RemoteLauncher$ProcImpl.join(Launcher.java:1164)
15:40:46  at hudson.Launcher$ProcStarter.join(Launcher.java:492)
15:40:46  at hudson.plugins.gradle.Gradle.performTask(Gradle.java:333)
15:40:46  at hudson.plugins.gradle.Gradle.perform(Gradle.java:225)
15:40:46  at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
15:40:46  at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:741)
15:40:46  at hudson.model.Build$BuildExecution.build(Build.java:206)
15:40:46  at hudson.model.Build$BuildExecution.doRun(Build.java:163)
15:40:46  at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:504)
15:40:46  at hudson.model.Run.execute(Run.java:1815)
15:40:46  at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
15:40:46  at hudson.model.ResourceController.execute(ResourceController.java:97)
15:40:46  at hudson.model.Executor.run(Executor.java:429)
15:40:46 Caused by: java.nio.channels.ClosedChannelException
15:40:46  at org.jenkinsci.remoting.protocol.NetworkLayer.onRecvClosed(NetworkLayer.java:154)
15:40:46  at org.jenkinsci.remoting.protocol.impl.NIONetworkLayer.ready(NIONetworkLayer.java:179)
15:40:46  at org.jenkinsci.remoting.protocol.IOHub$OnReady.run(IOHub.java:795)
15:40:46  at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
15:40:46  at jenkins.security.ImpersonatingExecutorService$1.run(ImpersonatingExecutorService.java:59)
15:40:46  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
15:40:46  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
15:40:46  at java.lang.Thread.run(Thread.java:748)
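
One way to read the master's log above is sketched below in Python. It pulls the idle timeout and the termination timestamp out of the quoted excerpt and estimates when the idle clock for the EC2 node started. The function names are made up for illustration, and the interpretation is an assumption rather than confirmed plugin behaviour: the EC2 plugin appears to count the node it launched as idle because the build runs on the separate Swarm connection.

```python
import re
from datetime import datetime, timedelta

# Excerpt copied from the master's log quoted in this issue.
MASTER_LOG = """\
Idle timeout of EC2 (Itiviti AWS) - Windows Jenkins node autoconnecting to deb-jenkins-prd using Swarm plugin (i-000908b57bb5d82a7) after 30 idle minutes, instance statusRUNNING
Sep 30, 2019 8:40:45 AM INFO hudson.plugins.ec2.EC2AbstractSlave idleTimeout
EC2 instance idle time expired: i-000908b57bb5d82a7
Sep 30, 2019 8:40:46 AM INFO hudson.plugins.ec2.EC2OndemandSlave terminate
Terminated EC2 instance (terminated): i-000908b57bb5d82a7
"""

STAMP = re.compile(r"^(\w{3} \d{1,2}, \d{4} \d{1,2}:\d{2}:\d{2} [AP]M)")


def idle_timeout_minutes(log: str) -> int:
    """Return the idle timeout reported in the 'Idle timeout of ...' line."""
    match = re.search(r"after (\d+) idle minutes", log)
    if match is None:
        raise ValueError("no idle timeout line found")
    return int(match.group(1))


def termination_time(log: str) -> datetime:
    """Return the timestamp of the header line preceding 'Terminated EC2 instance'."""
    lines = log.splitlines()
    for header, message in zip(lines, lines[1:]):
        if message.startswith("Terminated EC2 instance"):
            stamp = STAMP.match(header)
            if stamp is None:
                raise ValueError("termination entry has no timestamp header")
            return datetime.strptime(stamp.group(1), "%b %d, %Y %I:%M:%S %p")
    raise ValueError("no termination entry found")


def idle_clock_start(log: str) -> datetime:
    """Approximate start of the idle period: termination time minus the timeout."""
    return termination_time(log) - timedelta(minutes=idle_timeout_minutes(log))


def build_budget_minutes(timeout_minutes: int, idle_before_build: int) -> int:
    """Minutes a build can run before the timeout fires.

    Assumption: the build never resets the EC2 node's idle clock, and
    idle_before_build (hypothetical) is how long the node sat idle first.
    """
    return timeout_minutes - idle_before_build


if __name__ == "__main__":
    print("idle timeout (min):", idle_timeout_minutes(MASTER_LOG))
    print("terminated at:", termination_time(MASTER_LOG))
    print("idle since about:", idle_clock_start(MASTER_LOG))
    print("budget with 5 min head start:", build_budget_minutes(30, 5))
```

If that reading is right, a build that starts 4 to 5 minutes after the idle clock began would be cut off at about 25 to 26 minutes, which matches the failures described above. The 4 to 5 minute head start is inferred from those numbers, not measured.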
This message was sent by Atlassian Jira (v7.13.6#713006-sha1:cc4451f)

cimpoiesgeorge@gmail.com (JIRA)

Oct 1, 2019, 5:08:02 AM
to jenkinsc...@googlegroups.com
George Cimpoies updated an issue
After that period of time the slave is disconnected even though the build was running on it, resulting in:

15:40:46 FATAL: command execution failed
15:40:46 java.io.IOException: Backing channel 'JNLP4-connect connection from 10.230.0.176/10.230.0.176:49733' is disconnected.
15:40:46  at hudson.remoting.RemoteInvocationHandler.channelOrFail(RemoteInvocationHandler.java:214)
15:40:46  at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:283)
15:40:46  at com.sun.proxy.$Proxy89.isAlive(Unknown Source)
15:40:46  at hudson.Launcher$RemoteLauncher$ProcImpl.isAlive(Launcher.java:1172)
15:40:46  at hudson.Launcher$RemoteLauncher$ProcImpl.join(Launcher.java:1164)
15:40:46  at hudson.Launcher$ProcStarter.join(Launcher.java:492)
15:40:46  at hudson.plugins.gradle.Gradle.performTask(Gradle.java:333)
15:40:46  at hudson.plugins.gradle.Gradle.perform(Gradle.java:225)
15:40:46  at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
15:40:46  at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:741)
15:40:46  at hudson.model.Build$BuildExecution.build(Build.java:206)
15:40:46  at hudson.model.Build$BuildExecution.doRun(Build.java:163)
15:40:46  at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:504)
15:40:46  at hudson.model.Run.execute(Run.java:1815)
15:40:46  at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
15:40:46  at hudson.model.ResourceController.execute(ResourceController.java:97)
15:40:46  at hudson.model.Executor.run(Executor.java:429)
15:40:46 Caused by: java.nio.channels.ClosedChannelException
15:40:46  at org.jenkinsci.remoting.protocol.NetworkLayer.onRecvClosed(NetworkLayer.java:154)
15:40:46  at org.jenkinsci.remoting.protocol.impl.NIONetworkLayer.ready(NIONetworkLayer.java:179)
15:40:46  at org.jenkinsci.remoting.protocol.IOHub$OnReady.run(IOHub.java:795)
15:40:46  at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
15:40:46  at jenkins.security.ImpersonatingExecutorService$1.run(ImpersonatingExecutorService.java:59)
15:40:46  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
15:40:46  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
15:40:46  at java.lang.Thread.run(Thread.java:748)

Any help in tracking down the problem is much appreciated!

cimpoiesgeorge@gmail.com (JIRA)

Oct 1, 2019, 7:32:04 AM
to jenkinsc...@googlegroups.com
George Cimpoies closed an issue as Not A Defect
Change By: George Cimpoies
Status: Closed (was: Open)
Resolution: Not A Defect