15:31:47 Running on swarm-client-agent.example.com in /var/tmp/jenkins/workspace/path/to/workspace[Pipeline] {[Pipeline] ws15:31:47 Running in /export/home/path/to/ws[Pipeline] {[Pipeline] stage[Pipeline] { (Check out repository)[Pipeline] checkout15:32:42 > git rev-parse --is-inside-work-tree # timeout=10[Pipeline] }15:36:35 ERROR: Execution failed15:36:35 java.nio.channels.ClosedChannelException15:36:35 Also: hudson.remoting.Channel$CallSiteStackTrace: Remote call to JNLP4-connect connection from 10.110.xyz.xyz/10.110.xyz.xyz:4429115:36:35 at hudson.remoting.Channel.attachCallSiteStackTrace(Channel.java:1693)15:36:35 at hudson.remoting.Request.call(Request.java:192)15:36:35 at hudson.remoting.Channel.call(Channel.java:907)15:36:35 at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:281)15:36:35 at com.sun.proxy.$Proxy92.hasGitRepo(Unknown Source)15:36:35 at org.jenkinsci.plugins.gitclient.RemoteGitImpl.hasGitRepo(RemoteGitImpl.java:280)15:36:35 at hudson.plugins.git.GitSCM.retrieveChanges(GitSCM.java:1068)15:36:35 at hudson.plugins.git.GitSCM.checkout(GitSCM.java:1123)15:36:35 at org.jenkinsci.plugins.workflow.steps.scm.SCMStep.checkout(SCMStep.java:113)15:36:35 at org.jenkinsci.plugins.workflow.steps.scm.SCMStep$StepExecutionImpl.run(SCMStep.java:85)15:36:35 at org.jenkinsci.plugins.workflow.steps.scm.SCMStep$StepExecutionImpl.run(SCMStep.java:75)15:36:35 at org.jenkinsci.plugins.workflow.steps.AbstractSynchronousNonBlockingStepExecution$1$1.call(AbstractSynchronousNonBlockingStepExecution.java:47)15:36:35 at hudson.security.ACL.impersonate(ACL.java:260)15:36:35 at org.jenkinsci.plugins.workflow.steps.AbstractSynchronousNonBlockingStepExecution$1.run(AbstractSynchronousNonBlockingStepExecution.java:44)15:36:35 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)15:36:35 at java.util.concurrent.FutureTask.run(FutureTask.java:266)15:36:35 Caused: hudson.remoting.RequestAbortedException15:36:35 at hudson.remoting.Request.abort(Request.java:329)15:36:35 at hudson.remoting.Channel.terminate(Channel.java:992)15:36:35 at org.jenkinsci.remoting.protocol.impl.ChannelApplicationLayer.onReadClosed(ChannelApplicationLayer.java:208)15:36:35 at org.jenkinsci.remoting.protocol.ApplicationLayer.onRecvClosed(ApplicationLayer.java:222)15:36:35 at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.onRecvClosed(ProtocolStack.java:832)15:36:35 at org.jenkinsci.remoting.protocol.FilterLayer.onRecvClosed(FilterLayer.java:287)15:36:35 at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.onRecvClosed(SSLEngineFilterLayer.java:172)15:36:35 at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.onRecvClosed(ProtocolStack.java:832)15:36:35 at org.jenkinsci.remoting.protocol.NetworkLayer.onRecvClosed(NetworkLayer.java:154)15:36:35 at org.jenkinsci.remoting.protocol.impl.NIONetworkLayer.ready(NIONetworkLayer.java:179)15:36:35 at org.jenkinsci.remoting.protocol.IOHub$OnReady.run(IOHub.java:721)15:36:35 at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)15:36:35 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)15:36:35 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)15:36:35 at java.lang.Thread.run(Thread.java:748)
2018-02-01 23:30:51 INFO hudson.plugins.swarm.Client main Client.main invoked with: [-name swarm-client-agent.example.com -fsroot /var/tmp/jenkins -master http://jenkins-master.example.com -username <redacted> -password <redacted> -executors 1 -description Dynamically configured Jenkins swarm plugin slave swarm-client-agent.example.com -mode exclusive -deleteExistingClients -disableClientsUniqueId -disableSslVerification]2018-02-01 23:30:52 CONFIG hudson.plugins.swarm.SwarmClient <init> SwarmClient constructed with hudson.plugins.swarm.Options@6f94fa3e2018-02-01 23:30:52 INFO hudson.plugins.swarm.Client run Discovering Jenkins master2018-02-01 23:30:52 CONFIG hudson.plugins.swarm.SwarmClient discoverFromMasterUrl discoverFromMasterUrl() invoked2018-02-01 23:30:52 CONFIG hudson.plugins.swarm.SwarmClient discoverFromMasterUrl Connecting to http://jenkins-master.example.com/ to configure swarm client.SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".SLF4J: Defaulting to no-operation (NOP) logger implementationSLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.2018-02-01 23:30:55 INFO hudson.plugins.swarm.Client run Attempting to connect to http://jenkins-master.example.com/ 17e24806-9248-4b9b-86aa-ce97caa34001 with ID2018-02-01 23:30:55 SEVERE hudson.plugins.swarm.SwarmClient getCsrfCrumb Could not obtain CSRF crumb. Response code: 4042018-02-01 23:30:56 INFO hudson.remoting.jnlp.Main createEngine Setting up agent: swarm-client-agent.example.com2018-02-01 23:30:56 INFO hudson.remoting.jnlp.Main$CuiListener <init> Jenkins agent is running in headless mode.2018-02-01 23:30:56 WARNING hudson.remoting.Engine startEngine No Working Directory. Using the legacy JAR Cache location: /var/tmp/jenkins/.jenkins/cache/jars2018-02-01 23:30:56 INFO hudson.remoting.jnlp.Main$CuiListener status Locating server among [http://jenkins-master.example.com/]2018-02-01 23:30:56 INFO org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver resolve Remoting server accepts the following protocols: [JNLP4-connect, Ping]2018-02-01 23:30:56 INFO hudson.remoting.jnlp.Main$CuiListener status Agent discovery successfulAgent address: jenkins-master.example.comAgent port: 53000Identity: 2e:a8:0f:2c:95:86:84:c4:85:ee:52:d2:38:7b:3f:b52018-02-01 23:30:56 INFO hudson.remoting.jnlp.Main$CuiListener status Handshaking2018-02-01 23:30:56 INFO hudson.remoting.jnlp.Main$CuiListener status Connecting to jenkins-master.example.com:530002018-02-01 23:30:56 INFO hudson.remoting.jnlp.Main$CuiListener status Trying protocol: JNLP4-connect2018-02-01 23:30:57 INFO hudson.remoting.jnlp.Main$CuiListener status Remote identity confirmed: 2e:a8:0f:2c:95:86:84:c4:85:ee:52:d2:38:7b:3f:b52018-02-01 23:30:58 INFO hudson.remoting.jnlp.Main$CuiListener status Connected2018-02-01 23:31:46 CONFIG jenkins.util.SystemProperties getString Property (not found): hudson.Main.development => null2018-02-01 23:31:46 CONFIG jenkins.util.SystemProperties getString Property (not found): hudson.Main.timeout => null2018-02-01 23:31:56 CONFIG jenkins.util.SystemProperties getString Property (not found): hudson.ProxyConfiguration.DEFAULT_CONNECT_TIMEOUT_MILLIS => null2018-02-01 23:32:33 CONFIG jenkins.util.SystemProperties getString Property (not found): hudson.util.Secret.provider => null2018-02-01 23:33:32 CONFIG jenkins.util.SystemProperties getString Property (not found): hudson.Util.noSymLink => null2018-02-01 23:33:32 CONFIG jenkins.util.SystemProperties getString Property (not found): hudson.Util.symlinkEscapeHatch => null2018-02-01 23:33:32 CONFIG jenkins.util.SystemProperties getString Property (not found): hudson.Util.maxFileDeletionRetries => null2018-02-01 23:33:32 CONFIG jenkins.util.SystemProperties getString Property (not found): hudson.Util.deletionRetryWait => null2018-02-01 23:33:32 CONFIG jenkins.util.SystemProperties getString Property (not found): hudson.Util.performGCOnFailedDelete => null
2018-02-01 23:34:38 FINER org.jenkinsci.remoting.protocol.impl.BIONetworkLayer$Reader run [JNLP4-connect connection to jenkins-master.example.com/172.16.106.38:53000] Unexpected I/O exceptionjava.io.IOException: Connection timed outat sun.nio.ch.FileDispatcherImpl.read0(Native Method)at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)at sun.nio.ch.IOUtil.read(IOUtil.java:197)at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)at hudson.remoting.SocketChannelStream$1.read(SocketChannelStream.java:35)at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:65)at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:109)at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103)at java.nio.channels.Channels$ReadableByteChannelImpl.read(Channels.java:385)at org.jenkinsci.remoting.protocol.impl.BIONetworkLayer$Reader.run(BIONetworkLayer.java:245)at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)at hudson.remoting.Engine$1$1.run(Engine.java:94)at java.lang.Thread.run(Thread.java:748)2018-02-01 23:34:38 FINEST org.jenkinsci.remoting.protocol.NetworkLayer onRecvClosed [JNLP4-connect connection to jenkins-master.example.com/172.16.106.38:53000] RECV Closed2018-02-01 23:34:38 FINEST org.jenkinsci.remoting.protocol.FilterLayer onRecvClosed [JNLP4-connect connection to jenkins-master.example.com/172.16.106.38:53000] RECV Closed2018-02-01 23:34:38 FINE org.jenkinsci.remoting.protocol.ApplicationLayer onRecvClosed [JNLP4-connect connection to jenkins-master.example.com/172.16.106.38:53000] RECV Closed2018-02-01 23:34:38 INFO hudson.remoting.jnlp.Main$CuiListener status Terminated2018-02-01 23:34:38 FINEST org.jenkinsci.remoting.protocol.IOHub$IOHubSelectorWatcher run Windows IOHub Watcher for IOHub#1: Selector[keys:0, gen:0] / pool-1-thread-2: Finished2018-02-01 23:34:38 FINE hudson.remoting.jnlp.Main main Engine has died2018-02-01 23:34:38 INFO hudson.plugins.swarm.Client run Retrying in 10 seconds2018-02-01 23:34:38 WARNING hudson.remoting.UserRequest perform LinkageError while performing UserRequest:UserRPCRequest(27,hasGitRepo)java.lang.NoClassDefFoundError: hudson/console/ConsoleAnnotationDescriptorat hudson.util.StreamTaskListener._error(StreamTaskListener.java:166)at hudson.util.StreamTaskListener.error(StreamTaskListener.java:175)at org.jenkinsci.plugins.gitclient.CliGitAPIImpl.hasGitRepo(CliGitAPIImpl.java:259)at hudson.plugins.git.GitAPI.hasGitRepo(GitAPI.java:232)at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)at java.lang.reflect.Method.invoke(Method.java:498)at hudson.remoting.RemoteInvocationHandler$RPCRequest.perform(RemoteInvocationHandler.java:922)at hudson.remoting.RemoteInvocationHandler$RPCRequest.call(RemoteInvocationHandler.java:896)at hudson.remoting.RemoteInvocationHandler$RPCRequest.call(RemoteInvocationHandler.java:853)at hudson.remoting.UserRequest.perform(UserRequest.java:210)at hudson.remoting.UserRequest.perform(UserRequest.java:53)at hudson.remoting.Request$2.run(Request.java:358)at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:72)at java.util.concurrent.FutureTask.run(FutureTask.java:266)at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)at hudson.remoting.Engine$1$1.run(Engine.java:94)at java.lang.Thread.run(Thread.java:748)Caused by: java.lang.ClassNotFoundException: hudson.console.ConsoleAnnotationDescriptorat java.net.URLClassLoader.findClass(URLClassLoader.java:381)at hudson.remoting.RemoteClassLoader.findClass(RemoteClassLoader.java:171)at java.lang.ClassLoader.loadClass(ClassLoader.java:424)at java.lang.ClassLoader.loadClass(ClassLoader.java:357)... 20 more2018-02-01 23:34:38 FINE hudson.slaves.ChannelPinger$2 onClosed Terminating ping thread for JNLP4-connect connection to jenkins-master.example.com/172.16.106.38:530002018-02-01 23:34:38 FINE hudson.remoting.PingThread run Ping thread for channel hudson.remoting.Channel@5cc1fe53:JNLP4-connect connection to jenkins-master.example.com/172.16.106.38:53000 is interrupted. Terminating2018-02-01 23:34:38 FINE org.jenkinsci.remoting.protocol.ApplicationLayer doCloseWrite [JNLP4-connect connection to jenkins-master.example.com/172.16.106.38:53000] Closing SEND2018-02-01 23:34:38 FINEST org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer processWrite [JNLP4-connect connection to jenkins-master.example.com/172.16.106.38:53000] APP ENCODE: 0 bytes2018-02-01 23:34:38 WARNING hudson.remoting.Request$2 run Failed to send back a reply to the request hudson.remoting.Request$2@2d486301hudson.remoting.ChannelClosedException: Channel "hudson.remoting.Channel@5cc1fe53:JNLP4-connect connection to jenkins-master.example.com/172.16.106.38:53000": channel is already closedat hudson.remoting.Channel.send(Channel.java:672)at hudson.remoting.Request$2.run(Request.java:371)at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:72)at java.util.concurrent.FutureTask.run(FutureTask.java:266)at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)at hudson.remoting.Engine$1$1.run(Engine.java:94)at java.lang.Thread.run(Thread.java:748)Caused by: java.nio.channels.ClosedChannelExceptionat org.jenkinsci.remoting.protocol.NetworkLayer.onRecvClosed(NetworkLayer.java:154)at org.jenkinsci.remoting.protocol.impl.BIONetworkLayer.access$1800(BIONetworkLayer.java:48)at org.jenkinsci.remoting.protocol.impl.BIONetworkLayer$Reader.run(BIONetworkLayer.java:264)... 4 more2018-02-01 23:34:38 FINEST org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer processWrite [JNLP4-connect connection to jenkins-master.example.com/172.16.106.38:53000] Handshake status: NOT_HANDSHAKING engine result: Status = CLOSED HandshakeStatus = NOT_HANDSHAKINGbytesConsumed = 0 bytesProduced = 852018-02-01 23:34:38 FINEST org.jenkinsci.remoting.protocol.FilterLayer onRecvClosed [JNLP4-connect connection to jenkins-master.example.com/172.16.106.38:53000] RECV Closed2018-02-01 23:34:38 FINEST org.jenkinsci.remoting.protocol.FilterLayer doCloseSend [JNLP4-connect connection to jenkins-master.example.com/172.16.106.38:53000] Closing SEND2018-02-01 23:34:38 FINEST org.jenkinsci.remoting.protocol.impl.BIONetworkLayer doCloseSend [JNLP4-connect connection to jenkins-master.example.com/172.16.106.38:53000] Closing SEND2018-02-01 23:34:38 FINEST org.jenkinsci.remoting.protocol.impl.BIONetworkLayer$Reader run [JNLP4-connect connection to jenkins-master.example.com/172.16.106.38:53000] Reader thread stopped2018-02-01 23:34:48 CONFIG hudson.plugins.swarm.SwarmClient discoverFromMasterUrl discoverFromMasterUrl() invoked2018-02-01 23:34:48 CONFIG hudson.plugins.swarm.SwarmClient discoverFromMasterUrl Connecting to http://jenkins-master.example.com/ to configure swarm client.
2018-02-01 23:34:48 FINE hudson.plugins.swarm.SwarmClient createHttpClient createHttpClient() invoked2018-02-01 23:34:48 FINE hudson.plugins.swarm.SwarmClient createHttpClient Setting HttpClient credentials based on options passed2018-02-01 23:36:47 FINER hudson.plugins.swarm.SwarmClient getChildElementString getChildElementString() invoked2018-02-01 23:36:47 FINE hudson.plugins.swarm.Candidate <init> Candidate constructed with url: http://jenkins-master.example.com/, secret: 17e24806-9248-4b9b-86aa-ce97caa340012018-02-01 23:36:47 INFO hudson.plugins.swarm.Client run Attempting to connect to http://jenkins-master.example.com/ 17e24806-9248-4b9b-86aa-ce97caa34001 with ID2018-02-01 23:36:51 SEVERE hudson.plugins.swarm.SwarmClient getCsrfCrumb Could not obtain CSRF crumb. Response code: 4042018-02-01 23:36:57 INFO hudson.remoting.jnlp.Main createEngine Setting up agent: swarm-client-agent.example.com2018-02-01 23:36:57 INFO hudson.remoting.jnlp.Main$CuiListener <init> Jenkins agent is running in headless mode.2018-02-01 23:36:57 WARNING hudson.remoting.Engine startEngine No Working Directory. Using the legacy JAR Cache location: /var/tmp/jenkins/.jenkins/cache/jars2018-02-01 23:36:57 INFO hudson.remoting.jnlp.Main$CuiListener status Locating server among [http://jenkins-master.example.com/]2018-02-01 23:36:57 INFO org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver resolve Remoting server accepts the following protocols: [JNLP4-connect, Ping]2018-02-01 23:36:57 INFO hudson.remoting.jnlp.Main$CuiListener status Agent discovery successfulAgent address: jenkins-master.example.comAgent port: 53000Identity: 2e:a8:0f:2c:95:86:84:c4:85:ee:52:d2:38:7b:3f:b52018-02-01 23:36:57 INFO hudson.remoting.jnlp.Main$CuiListener status Handshaking2018-02-01 23:36:57 INFO hudson.remoting.jnlp.Main$CuiListener status Connecting to jenkins-master.example.com:530002018-02-01 23:36:57 INFO hudson.remoting.jnlp.Main$CuiListener status Trying protocol: JNLP4-connect2018-02-01 23:36:57 INFO hudson.remoting.jnlp.Main$CuiListener status Remote identity confirmed: 2e:a8:0f:2c:95:86:84:c4:85:ee:52:d2:38:7b:3f:b52018-02-01 23:36:57 INFO hudson.remoting.jnlp.Main$CuiListener status Connected2018-02-01 23:37:06 CONFIG jenkins.util.SystemProperties getString Property (not found): hudson.Main.development => null2018-02-01 23:37:06 CONFIG jenkins.util.SystemProperties getString Property (not found): hudson.Main.timeout => null
Hi Oleg,
Thanks for the quick reply. Before I upgraded Jenkins this past weekend, I was seeing a very similar (but not identical) intermittent failure with Jenkins 2.89.2 LTS (Remoting 3.14) and Swarm Plugin client 3.6 (3.10.2).
My job would start, spin up the VM on AWS, and the swarm client would connect back to the master successfully, check out the Git repository, and run the tests (which take about 2-3 hours). But in the "publish results" stage after the tests, I would frequently see this on the master:
java.nio.channels.ClosedChannelException
at org.jenkinsci.remoting.protocol.NetworkLayer.onRecvClosed(NetworkLayer.java:154)
at org.jenkinsci.remoting.protocol.impl.NIONetworkLayer.ready(NIONetworkLayer.java:179)
at org.jenkinsci.remoting.protocol.IOHub$OnReady.run(IOHub.java:721)
at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
Caused: hudson.remoting.ChannelClosedException: Remote call on JNLP4-connect connection from 10.110.xyz.xyz/10.110.xyz.xyz:61938 failed. The channel is closing down or has closed down
at hudson.remoting.Channel.call(Channel.java:901)
at hudson.FilePath.act(FilePath.java:986)
Caused: java.io.IOException: remote file operation failed: /export/home/path/to/ws at hudson.remoting.Channel@71abe1cc:JNLP4-connect connection from 10.110.xyz.xyz/10.110.xyz.xyz:61938
at hudson.FilePath.act(FilePath.java:993)
at hudson.FilePath.act(FilePath.java:975)
at hudson.FilePath.mkdirs(FilePath.java:1158)
at org.jenkinsci.plugins.workflow.steps.CoreStep$Execution.run(CoreStep.java:79)
at org.jenkinsci.plugins.workflow.steps.CoreStep$Execution.run(CoreStep.java:67)
at org.jenkinsci.plugins.workflow.steps.SynchronousNonBlockingStepExecution$1$1.call(SynchronousNonBlockingStepExecution.java:49)
at hudson.security.ACL.impersonate(ACL.java:260)
at org.jenkinsci.plugins.workflow.steps.SynchronousNonBlockingStepExecution$1.run(SynchronousNonBlockingStepExecution.java:46)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Finished: FAILURE
Unfortunately, I don't have the client side logs when running Swarm Plugin client 3.6, since this was before I enabled logging on the client side. I will downgrade the Swarm Plugin client from 3.8 (Remoting 3.16) to 3.6 (Remoting 3.10.2) but keep the logging enabled and update this thread once I hit the problem again and collect those logs.
The really bizarre thing about the old failure mode was that the remote operation that failed was a FilePath.mkdirs call to the workspace itself, which we had been successfully using throughout the duration of the test run. There's no way it would have been inaccessible at the file system level on the swarm client at the end of the test run during the publish results stage (the workspace is on local disk in the VM); rather, it must be that the Remoting connection got itself into a pathological state during the test run and then failed explicitly at the final publish results stage.
With Swarm Plugin client 3.8 (Remoting 3.16), I see the failure at the very beginning of the test run, which in some ways is better (fail fast) but in other ways is worse (my tests don't get run).
If there’s anything else I can do to help debug, or a part of the source code that I should start reading first to understand what’s happening, then let me know.
Thanks,
Basil
Hi all,