Key: JENKINS-9576
URL: https://issues.jenkins-ci.org/browse/JENKINS-9576
Project: Jenkins
Issue Type: Bug
Components: core
Environment: Host and slaves: Win XP
Reporter: aleksas
Fix For: current
Jenkins host sees node(s) as offline but node services are still running on slaves.
Exception from node details page:
java.net.SocketException: Socket closed
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(Unknown Source)
at java.io.BufferedInputStream.fill(Unknown Source)
at java.io.BufferedInputStream.read1(Unknown Source)
at java.io.BufferedInputStream.read(Unknown Source)
at java.io.ObjectInputStream$PeekInputStream.read(Unknown Source)
at java.io.ObjectInputStream$PeekInputStream.readFully(Unknown Source)
at java.io.ObjectInputStream$BlockDataInputStream.readUTFBody(Unknown Source)
at java.io.ObjectInputStream$BlockDataInputStream.readUTF(Unknown Source)
at java.io.ObjectInputStream.readUTF(Unknown Source)
at java.io.ObjectStreamClass.readNonProxy(Unknown Source)
at java.io.ObjectInputStream.readClassDescriptor(Unknown Source)
at java.io.ObjectInputStream.readNonProxyDesc(Unknown Source)
at java.io.ObjectInputStream.readClassDesc(Unknown Source)
at java.io.ObjectInputStream.readOrdinaryObject(Unknown Source)
at java.io.ObjectInputStream.readObject0(Unknown Source)
at java.io.ObjectInputStream.readObject(Unknown Source)
at hudson.remoting.Channel$ReaderThread.run(Channel.java:992)
Exception from job build log:
FATAL: hudson.remoting.RequestAbortedException: java.net.SocketException: Socket closed
hudson.remoting.RequestAbortedException: hudson.remoting.RequestAbortedException: java.net.SocketException: Socket closed
at hudson.remoting.Request.call(Request.java:137)
at hudson.remoting.Channel.call(Channel.java:643)
at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:158)
at $Proxy43.join(Unknown Source)
at hudson.Launcher$RemoteLauncher$ProcImpl.join(Launcher.java:850)
at hudson.Launcher$ProcStarter.join(Launcher.java:336)
at hudson.plugins.msbuild.MsBuildBuilder.perform(MsBuildBuilder.java:130)
at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
at hudson.model.AbstractBuild$AbstractRunner.perform(AbstractBuild.java:662)
at hudson.model.Build$RunnerImpl.build(Build.java:177)
at hudson.model.Build$RunnerImpl.doRun(Build.java:139)
at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:429)
at hudson.model.Run.run(Run.java:1374)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
at hudson.model.ResourceController.execute(ResourceController.java:88)
at hudson.model.Executor.run(Executor.java:145)
Caused by: hudson.remoting.RequestAbortedException: java.net.SocketException: Socket closed
at hudson.remoting.Request.abort(Request.java:257)
at hudson.remoting.Channel.terminate(Channel.java:694)
at hudson.remoting.Channel$ReaderThread.run(Channel.java:1016)
Caused by: java.net.SocketException: Socket closed
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(Unknown Source)
at java.io.BufferedInputStream.fill(Unknown Source)
at java.io.BufferedInputStream.read1(Unknown Source)
at java.io.BufferedInputStream.read(Unknown Source)
at java.io.ObjectInputStream$PeekInputStream.read(Unknown Source)
at java.io.ObjectInputStream$PeekInputStream.readFully(Unknown Source)
at java.io.ObjectInputStream$BlockDataInputStream.readUTFBody(Unknown Source)
at java.io.ObjectInputStream$BlockDataInputStream.readUTF(Unknown Source)
at java.io.ObjectInputStream.readUTF(Unknown Source)
at java.io.ObjectStreamClass.readNonProxy(Unknown Source)
at java.io.ObjectInputStream.readClassDescriptor(Unknown Source)
at java.io.ObjectInputStream.readNonProxyDesc(Unknown Source)
at java.io.ObjectInputStream.readClassDesc(Unknown Source)
at java.io.ObjectInputStream.readOrdinaryObject(Unknown Source)
at java.io.ObjectInputStream.readObject0(Unknown Source)
at java.io.ObjectInputStream.readObject(Unknown Source)
at hudson.remoting.Channel$ReaderThread.run(Channel.java:992)
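The build-log trace above is a three-level chain: the outer RequestAbortedException wraps a second one, which in turn wraps the java.net.SocketException raised in Channel$ReaderThread. As a minimal sketch of how the innermost cause could be extracted when triaging such logs, the chain can be followed with Throwable.getCause(). The class name RootCause is hypothetical and not part of Jenkins or hudson.remoting, and plain RuntimeException stands in for RequestAbortedException so the example needs only the JDK.

```java
import java.net.SocketException;

/** Hypothetical helper (not part of Jenkins): finds the innermost cause of an exception. */
final class RootCause {

    /** Follows getCause() until there is no further cause and returns that throwable. */
    static Throwable of(Throwable t) {
        Throwable current = t;
        // Stop at the end of the chain; the second check guards against a self-referential cause.
        while (current.getCause() != null && current.getCause() != current) {
            current = current.getCause();
        }
        return current;
    }

    public static void main(String[] args) {
        // RuntimeException is a stand-in for hudson.remoting.RequestAbortedException.
        Throwable root = new SocketException("Socket closed");
        Throwable chain = new RuntimeException("request aborted",
                new RuntimeException("request aborted", root));
        System.out.println(RootCause.of(chain));
    }
}
```

Here the root cause is the SocketException, which matches the exception shown on the node details page.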
--
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
aleksas commented on JENKINS-9576:
----------------------------------
The problem seems to occur when both XP slaves are running. This could be a VM cloning issue rather than a Jenkins bug. I will close the issue if several more checks confirm this.
John Muczynski commented on JENKINS-9576:
-----------------------------------------
I'm getting these socket errors.
I'm running two VMs: XP and Win7
Increasing the process priority of the slave processes inside the VM to RealTime didn't help.
Upgrading to Jenkins ver. 1.413 didn't fix the problem.
I've increased the process priority of the Jenkins JVM in the host OS, and will wait and see what that'll do ...
Is there any control for the socket ping timeout period? I'd like to set it to 5 minutes ...
John Muczynski commented on JENKINS-9576:
-----------------------------------------
My VM running XP has performed a few 6 hour runs without issue, but the VM running win7 has had a failure.
Increasing the process priorities may have traded the socket exception for "java.io.IOException: Unexpected termination of the channel".
By the way, the win7 host OS wouldn't let me increase the priority of the Jenkins JVM to RealTime, but rather only to High.
I noticed that when the RealTime slave process in the VM does certain operations (probably running Ant) then it creates another JVM process named java.exe -- that process isn't RealTime, but rather normal.
So, which process gets pinged by Jenkins master?
I'll try more jobs on the win7 VM and see if the problem is persistent.
By the way, these jobs have long-running .exe programs spawned from Ant.
John Muczynski commented on JENKINS-9576:
-----------------------------------------
I'm still getting "IOException: Unexpected termination of the channel" on both Win7 and Win XP slaves. This sounds like JENKINS-6817.
jspohr commented on JENKINS-9576:
---------------------------------
I see this once or twice every day now. It is getting increasingly common, especially with longer builds. I'm running Jenkins 1.446 with 6 build nodes, 2 of which are 64-bit Windows 7. Only the Windows nodes exhibit the issue. I installed the newest JRE and tried both 32-bit and 64-bit JVMs on the nodes. The server is running 64-bit Ubuntu with a 64-bit JVM.
The error always happens when a file operation is performed, although this is probably only symptomatic, as all my builds are shell script based:
{quote}
FATAL: Unable to delete script file C:\Users\BLACKP~1\AppData\Local\Temp\hudson1503629514982734336.sh
hudson.util.IOException2: remote file operation failed: C:\Users\BLACKP~1\AppData\Local\Temp\hudson1503629514982734336.sh at hudson.remoting.Channel@651315ad:windows1
at hudson.FilePath.act(FilePath.java:779)
at hudson.FilePath.act(FilePath.java:765)
at hudson.FilePath.delete(FilePath.java:1020)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:92)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:58)
at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
at hudson.model.AbstractBuild$AbstractRunner.perform(AbstractBuild.java:697)
at hudson.model.Build$RunnerImpl.build(Build.java:178)
at hudson.model.Build$RunnerImpl.doRun(Build.java:139)
at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:467)
at hudson.model.Run.run(Run.java:1404)
at hudson.matrix.MatrixRun.run(MatrixRun.java:146)
at hudson.model.ResourceController.execute(ResourceController.java:88)
at hudson.model.Executor.run(Executor.java:238)
Caused by: hudson.remoting.ChannelClosedException: channel is already closed
at hudson.remoting.Channel.send(Channel.java:499)
at hudson.remoting.Request.call(Request.java:110)
at hudson.remoting.Channel.call(Channel.java:681)
at hudson.FilePath.act(FilePath.java:772)
... 13 more
Caused by: java.net.SocketException: Socket closed
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:146)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
at java.io.BufferedInputStream.read(BufferedInputStream.java:254)
at java.io.ObjectInputStream$PeekInputStream.peek(ObjectInputStream.java:2265)
at java.io.ObjectInputStream$BlockDataInputStream.peek(ObjectInputStream.java:2558)
at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2568)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1314)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:368)
at hudson.remoting.Channel$ReaderThread.run(Channel.java:1109)
{quote}
On the client, there is a message box which says
{quote}
java.lang.Exception: The server rejected the connection: windows1 is already connected to this master. Rejecting this connection.
at hudson.remoting.Engine.onConnectionRejected(Engine.java:258)
at hudson.remoting.Engine.run(Engine.java:233)
{quote}
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.jenkins-ci.org/secure/ContactAdministrators!default.jspa
jspohr commented on JENKINS-9576:
---------------------------------
While investigating the Windows PCs I found that both machines were badly configured and unstable due to overheating. So the SocketExceptions occurred because the PCs crashed. Sorry that I suspected a bug in Jenkins first. I'm still observing the situation, but it looks like this issue has disappeared for me.
Perhaps the one thing Jenkins could improve on is the error report. If it were possible to detect that a machine is no longer responding after a connection is closed, that would give users like me, whose machines are locked away out of sight, a better clue about what's going on.
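To illustrate the kind of check suggested above, here is a minimal sketch, not anything Jenkins actually does: after a channel closes, try a short TCP connect to the node and word the report accordingly. The class NodeProbe, its methods, and the message texts are all hypothetical. The sketch also assumes the node exposes some TCP port that is normally open, which the thread does not establish.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Hypothetical helper (not part of Jenkins): checks whether a node still accepts TCP connections. */
final class NodeProbe {

    /** Outcome of a probe, used to word the error report. */
    enum Status { RESPONDING, NOT_RESPONDING }

    /**
     * Tries to open a TCP connection to host:port within timeoutMillis.
     * A timeout, a refused connection, or an unresolvable host all count as NOT_RESPONDING.
     */
    static Status probe(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return Status.RESPONDING;
        } catch (IOException e) {
            return Status.NOT_RESPONDING;
        }
    }

    /** Builds a hint that could accompany "Socket closed" in a build log (wording is illustrative). */
    static String describe(String nodeName, Status status) {
        return status == Status.RESPONDING
                ? nodeName + ": connection closed, but the machine still accepts TCP connections"
                : nodeName + ": connection closed and the machine is not responding";
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.err.println("usage: NodeProbe <host> <port>");
            return;
        }
        Status status = probe(args[0], Integer.parseInt(args[1]), 3000);
        System.out.println(describe(args[0], status));
    }
}
```

A refused connection only shows that nothing is listening on that port, so this is a coarse signal. It would still have separated "machine crashed" from "agent process died" in the case described here.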