Automatically re-running a remote job without failing it, if the connection to the node drops

11 views
Skip to first unread message

Alexandru Băluț

unread,
May 15, 2019, 10:36:01 AM5/15/19
to Jenkins Users
I see jobs failing with "FATAL: command execution failed" caused by java.io.EOFException. I understand the connection to the remote node drops, so java.io.EOFException is raised. Is it possible to re-run the job remotely without failing it at all? I've seen the Naginator plugin, but it seems to start a new separate job if the current one fails, which is not what I want. The failure would bubble up to the parent job which started it and cause it to fail.

Or is it possible for the parent job to check the status of the child jobs started by it, and "manually" restart the ones failing due to java.io.EOFException until they succeed?


Building remotely on instance-1 (tag1) in workspace /var/lib/jenkins/workspace/eval
[vmu-eval-single] $ /bin/sh -xe /tmp/jenkins4616924287086740166.sh
+ /path/to/evaluation-tool
FATAL: command execution failed
java.io.EOFException
at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2681)
at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:3156)
at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:862)
at java.io.ObjectInputStream.<init>(ObjectInputStream.java:358)
at hudson.remoting.ObjectInputStreamEx.<init>(ObjectInputStreamEx.java:49)
at hudson.remoting.Command.readFrom(Command.java:140)
at hudson.remoting.Command.readFrom(Command.java:126)
at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:36)
at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:63)
Caused: java.io.IOException: Unexpected termination of the channel
at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:77)
Caused: java.io.IOException: Backing channel 'instance-1' is disconnected.
at hudson.remoting.RemoteInvocationHandler.channelOrFail(RemoteInvocationHandler.java:214)
at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:283)
at com.sun.proxy.$Proxy78.isAlive(Unknown Source)
at hudson.Launcher$RemoteLauncher$ProcImpl.isAlive(Launcher.java:1144)
at hudson.Launcher$RemoteLauncher$ProcImpl.join(Launcher.java:1136)
at hudson.tasks.CommandInterpreter.join(CommandInterpreter.java:155)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:109)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:66)
at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:744)
at hudson.model.Build$BuildExecution.build(Build.java:206)
at hudson.model.Build$BuildExecution.doRun(Build.java:163)
at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:504)
at hudson.model.Run.execute(Run.java:1816)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
at hudson.model.ResourceController.execute(ResourceController.java:97)
at hudson.model.Executor.run(Executor.java:429)
FATAL: Unable to delete script file /tmp/jenkins4616924287086740166.sh
java.io.EOFException
at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2681)
at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:3156)
at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:862)
at java.io.ObjectInputStream.<init>(ObjectInputStream.java:358)
at hudson.remoting.ObjectInputStreamEx.<init>(ObjectInputStreamEx.java:49)
at hudson.remoting.Command.readFrom(Command.java:140)
at hudson.remoting.Command.readFrom(Command.java:126)
at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:36)
at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:63)
Caused: java.io.IOException: Unexpected termination of the channel
at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:77)
Caused: hudson.remoting.ChannelClosedException: Channel "unknown": Remote call on instance-eval-worker-25 failed. The channel is closing down or has closed down
at hudson.remoting.Channel.call(Channel.java:950)
at hudson.FilePath.act(FilePath.java:1069)
at hudson.FilePath.act(FilePath.java:1058)
at hudson.FilePath.delete(FilePath.java:1539)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:123)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:66)
at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:744)
at hudson.model.Build$BuildExecution.build(Build.java:206)
at hudson.model.Build$BuildExecution.doRun(Build.java:163)
at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:504)
at hudson.model.Run.execute(Run.java:1816)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
at hudson.model.ResourceController.execute(ResourceController.java:97)
at hudson.model.Executor.run(Executor.java:429)
Build step 'Execute shell' marked build as failure
Finished: FAILURE

Reply all
Reply to author
Forward
0 new messages