First, the agent starts correctly and is identified on the master:
mars 22, 2018 5:40:04 PM hudson.remoting.jnlp.Main$CuiListener status
INFOS: Locating server among [http://xxxxxxxxxx:8080/]
mars 22, 2018 5:40:04 PM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver resolve
INFOS: Remoting server accepts the following protocols: [JNLP4-connect, Ping]
mars 22, 2018 5:40:04 PM hudson.remoting.jnlp.Main$CuiListener status
INFOS: Agent discovery successful
Agent address: xxxxxxxxxx
Agent port: 9999
Identity: xxxxxxxxxx
mars 22, 2018 5:40:04 PM hudson.remoting.jnlp.Main$CuiListener status
INFOS: Handshaking
mars 22, 2018 5:40:04 PM hudson.remoting.jnlp.Main$CuiListener status
INFOS: Connecting to topvm09.sesame.infotel.com:9999
mars 22, 2018 5:40:04 PM hudson.remoting.jnlp.Main$CuiListener status
INFOS: Trying protocol: JNLP4-connect
mars 22, 2018 5:40:04 PM hudson.remoting.jnlp.Main$CuiListener status
INFOS: Remote identity confirmed: xxxxxxxxxx
mars 22, 2018 5:40:05 PM hudson.remoting.jnlp.Main$CuiListener status
INFOS: Connected
mars 22, 2018 5:40:06 PM com.youdevise.hudson.slavestatus.SlaveListener call
INFOS: Slave-status listener starting
mars 22, 2018 5:40:06 PM com.youdevise.hudson.slavestatus.SocketHTTPListener waitForConnection
INFOS: Slave-status listener ready on port 3141
Then the master became unavailable (many OutOfMemory errors) and was restarted. In the meantime, the JNLP agent kept trying to reconnect to the master until the connection succeeded:
mars 28, 2018 1:49:25 PM hudson.slaves.ChannelPinger$1 onDead
INFOS: Ping failed. Terminating the channel JNLP4-connect connection to xxxxxxxxxx/192.168.2.98:9999.
java.util.concurrent.TimeoutException: Ping started at 1522237525477 hasn't completed by 1522237765505
at hudson.remoting.PingThread.ping(PingThread.java:134)
at hudson.remoting.PingThread.run(PingThread.java:90)
[... Repeated multiple times...]
mars 28, 2018 2:26:45 PM hudson.remoting.jnlp.Main$CuiListener status
INFOS: Terminated
mars 28, 2018 2:26:45 PM hudson.util.ProcessTree getKillers
AVERTISSEMENT: Failed to obtain killers
hudson.remoting.ChannelClosedException: Channel "unknown": Remote call on JNLP4-connect connection to xxxxxxxxxx/192.168.2.98:9999 failed. The channel is closing down or has closed down
at hudson.remoting.Channel.call(Channel.java:945)
at hudson.util.ProcessTree.getKillers(ProcessTree.java:159)
at hudson.util.ProcessTree$OSProcess.killByKiller(ProcessTree.java:220)
at hudson.util.ProcessTree$WindowsOSProcess.killRecursively(ProcessTree.java:436)
at hudson.util.ProcessTree.killAll(ProcessTree.java:146)
at hudson.Proc$LocalProc.destroy(Proc.java:384)
at hudson.Proc$LocalProc.join(Proc.java:357)
at hudson.Launcher$RemoteLaunchCallable$1.join(Launcher.java:1304)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at hudson.remoting.RemoteInvocationHandler$RPCRequest.perform(RemoteInvocationHandler.java:927)
at hudson.remoting.RemoteInvocationHandler$RPCRequest.call(RemoteInvocationHandler.java:901)
at hudson.remoting.RemoteInvocationHandler$RPCRequest.call(RemoteInvocationHandler.java:850)
at hudson.remoting.UserRequest.perform(UserRequest.java:210)
at hudson.remoting.UserRequest.perform(UserRequest.java:53)
at hudson.remoting.Request$2.run(Request.java:364)
at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:72)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at hudson.remoting.Engine$1$1.run(Engine.java:94)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.nio.channels.ClosedChannelException
at org.jenkinsci.remoting.protocol.NetworkLayer.onRecvClosed(NetworkLayer.java:154)
at org.jenkinsci.remoting.protocol.impl.BIONetworkLayer.access$1800(BIONetworkLayer.java:48)
at org.jenkinsci.remoting.protocol.impl.BIONetworkLayer$Reader.run(BIONetworkLayer.java:264)
... 4 more
[... Repeated multiple times...]
mars 28, 2018 2:26:46 PM hudson.remoting.Request$2 run
AVERTISSEMENT: Failed to send back a reply to the request hudson.remoting.Request$2@34a893f6
hudson.remoting.ChannelClosedException: Channel "hudson.remoting.Channel@71af25fc:JNLP4-connect connection to xxxxxxxxxx/192.168.2.98:9999": channel is already closed
at hudson.remoting.Channel.send(Channel.java:715)
at hudson.remoting.Request$2.run(Request.java:377)
at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:72)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at hudson.remoting.Engine$1$1.run(Engine.java:94)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.nio.channels.ClosedChannelException
at org.jenkinsci.remoting.protocol.NetworkLayer.onRecvClosed(NetworkLayer.java:154)
at org.jenkinsci.remoting.protocol.impl.BIONetworkLayer.access$1800(BIONetworkLayer.java:48)
at org.jenkinsci.remoting.protocol.impl.BIONetworkLayer$Reader.run(BIONetworkLayer.java:264)
... 4 more
mars 28, 2018 2:27:00 PM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver waitForReady
INFOS: Failed to connect to the master. Will try again: java.net.SocketTimeoutException connect timed out
[... Repeated multiple times...]
mars 28, 2018 2:31:49 PM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver waitForReady
INFOS: Master isn't ready to talk to us on http://topvm09.sesame.infotel.com:8080/tcpSlaveAgentListener/. Will try again: response code=503
mars 28, 2018 2:32:00 PM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver waitForReady
INFOS: Master isn't ready to talk to us on http://topvm09.sesame.infotel.com:8080/tcpSlaveAgentListener/. Will try again: response code=503
mars 28, 2018 2:32:15 PM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver waitForReady
INFOS: Failed to connect to the master. Will try again: java.net.SocketTimeoutException Read timed out
mars 28, 2018 2:32:30 PM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver waitForReady
INFOS: Failed to connect to the master. Will try again: java.net.SocketTimeoutException Read timed out
mars 28, 2018 2:32:40 PM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver waitForReady
INFOS: Master isn't ready to talk to us on http://topvm09.sesame.infotel.com:8080/tcpSlaveAgentListener/. Will try again: response code=503
But once the master was back, the agent died with the following stack trace:
mars 28, 2018 2:32:50 PM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver waitForReady
INFOS: Master isn't ready to talk to us on http://topvm09.sesame.infotel.com:8080/tcpSlaveAgentListener/. Will try again: response code=503
mars 28, 2018 2:33:01 PM hudson.remoting.jnlp.Main$CuiListener error
GRAVE: jenkins/slaves/restarter/JnlpSlaveRestarterInstaller
java.lang.NoClassDefFoundError: jenkins/slaves/restarter/JnlpSlaveRestarterInstaller
at jenkins.slaves.restarter.JnlpSlaveRestarterInstaller$FindEffectiveRestarters$1.onReconnect(JnlpSlaveRestarterInstaller.java:97)
at hudson.remoting.EngineListenerSplitter.onReconnect(EngineListenerSplitter.java:49)
at hudson.remoting.Engine.innerRun(Engine.java:662)
at hudson.remoting.Engine.run(Engine.java:469)
Caused by: java.lang.ClassNotFoundException: jenkins.slaves.restarter.JnlpSlaveRestarterInstaller
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at hudson.remoting.RemoteClassLoader.findClass(RemoteClassLoader.java:171)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 4 more
Please note that the changelog for 2.112 says remoting has been updated to 3.18, and I am using an earlier version of the agent. If the agent version mismatch is the root cause, I would expect Jenkins to complain about the deprecated agent version. PS: I don't know whether this is a "core" component issue.