Bulk issue update: Judging by this and other reports, plugin connectivity is still unstable. The recent patches in 1.24-1.25 probably added instability by removing the interlocks between the agent connection and termination logic. This appears to affect some reconnection scenarios because of race conditions.
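To illustrate what an interlock between connection and termination means here, the following is a minimal sketch, not the plugin's actual code. The `AgentLifecycle` class, its states, and its method names are all hypothetical. The idea is that every state change goes through one lock, so a reconnect cannot start while a teardown is still running.

```java
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical agent lifecycle guard; names are illustrative only. */
final class AgentLifecycle {
    enum State { OFFLINE, CONNECTING, ONLINE, TERMINATING }

    private final ReentrantLock lock = new ReentrantLock();
    private State state = State.OFFLINE;

    /** Starts a connection only when the agent is fully offline. */
    boolean beginConnect() {
        lock.lock();
        try {
            if (state != State.OFFLINE) {
                return false; // a connect or a teardown is still in progress
            }
            state = State.CONNECTING;
            return true;
        } finally {
            lock.unlock();
        }
    }

    /** Marks a pending connection as established. */
    void connected() {
        lock.lock();
        try {
            if (state == State.CONNECTING) {
                state = State.ONLINE;
            }
        } finally {
            lock.unlock();
        }
    }

    /** Starts a teardown unless the agent is offline or already terminating. */
    boolean beginTerminate() {
        lock.lock();
        try {
            if (state == State.OFFLINE || state == State.TERMINATING) {
                return false;
            }
            state = State.TERMINATING;
            return true;
        } finally {
            lock.unlock();
        }
    }

    /** Marks the teardown as finished, which allows a new connection. */
    void terminated() {
        lock.lock();
        try {
            if (state == State.TERMINATING) {
                state = State.OFFLINE;
            }
        } finally {
            lock.unlock();
        }
    }

    State state() {
        lock.lock();
        try {
            return state;
        } finally {
            lock.unlock();
        }
    }
}
```

Without such a guard, a reconnect attempt that arrives during teardown can interleave with the cleanup, which is the kind of race described above.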
Unfortunately, I do not have the capacity to work on the plugin in the medium term, so for now I am unassigning these issues from myself. Ivan Fernandez Calvo has kindly taken ownership of the plugin and will handle some of its workload. He will probably have some capacity to review the backlog I was unable to triage.
After upgrading the SSH Slaves plugin from 1.23 to 1.25, we started seeing the errors below in the logs. Slaves started to report clock differences of 10 seconds and ping times upwards of 17 seconds.
The master would then get pegged at 100% CPU usage, and all builds would start to fail.
===================== Errors {code} WARNING: Failed to monitor mesos-jenkins-ad54e4db86ab493cbfcbf40fc586031d-mesos for Free Temp Space java.util.concurrent.ExecutionException: java.lang.Error: Failed to deserialize the Callable object. at hudson.remoting.Channel$2.adapt(Channel.java:943) at hudson.remoting.Channel$2.adapt(Channel.java:938) at hudson.remoting.FutureAdapter.get(FutureAdapter.java:59) at hudson.node_monitors.AbstractAsyncNodeMonitorDescriptor.monitor(AbstractAsyncNodeMonitorDescriptor.java:96) at hudson.node_monitors.AbstractNodeMonitorDescriptor$Record.run(AbstractNodeMonitorDescriptor.java:305) Caused by: java.lang.Error: Failed to deserialize the Callable object. at hudson.remoting.UserRequest.perform(UserRequest.java:192) at hudson.remoting.UserRequest.perform(UserRequest.java:54) at hudson.remoting.Request$2.run(Request.java:360) at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:68) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at hudson.remoting.Engine$1$1.run(Engine.java:98) at java.lang.Thread.run(Thread.java:748) at ......remote call to JNLP4-connect connection from ip-10-89-142-141.us-west-2.compute.internal/10.89.142.141:48042(Native Method) at hudson.remoting.Channel.attachCallSiteStackTrace(Channel.java:1654) at hudson.remoting.UserResponse.retrieve(UserRequest.java:311) at hudson.remoting.Channel$2.adapt(Channel.java:941) ... 
4 more Caused by: hudson.remoting.RemotingSystemException: java.lang.InterruptedException at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:273) at com.sun.proxy.$Proxy6.fetch(Unknown Source) at hudson.remoting.RemoteClassLoader.findClass(RemoteClassLoader.java:301) at java.lang.ClassLoader.loadClass(ClassLoader.java:424) at java.lang.ClassLoader.loadClass(ClassLoader.java:357) at java.lang.Class.getDeclaredMethods0(Native Method) at java.lang.Class.privateGetDeclaredMethods(Class.java:2701) at java.lang.Class.getDeclaredMethod(Class.java:2128) at java.io.ObjectStreamClass.getPrivateMethod(ObjectStreamClass.java:1475) at java.io.ObjectStreamClass.access$1700(ObjectStreamClass.java:72) at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:498) at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:472) at java.security.AccessController.doPrivileged(Native Method) at java.io.ObjectStreamClass.<init>(ObjectStreamClass.java:472) at java.io.ObjectStreamClass.lookup(ObjectStreamClass.java:369) at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:598) at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1843) at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1713) at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2000) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535) at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2245) at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2169) at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2027) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535) at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2245) at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2169) at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2027) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535) at 
java.io.ObjectInputStream.readObject(ObjectInputStream.java:422) at hudson.remoting.UserRequest.deserialize(UserRequest.java:275) at hudson.remoting.UserRequest.perform(UserRequest.java:186) at hudson.remoting.UserRequest.perform(UserRequest.java:54) at hudson.remoting.Request$2.run(Request.java:360) at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:68) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at hudson.remoting.Engine$1$1.run(Engine.java:98) at java.lang.Thread.run(Thread.java:748) Caused by: java.lang.InterruptedException at java.lang.Object.wait(Native Method) at hudson.remoting.Request.call(Request.java:169) at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:260) ... 38 more {code} {code} Jan 22, 2018 7:41:37 PM hudson.node_monitors.AbstractAsyncNodeMonitorDescriptor monitor WARNING: Failed to monitor mesos-jenkins-9fe9204c80af4f5c98e67cf012e24a39-mesos for Free Disk Space java.util.concurrent.ExecutionException: java.io.InvalidClassException: hudson.FilePath; local class incompatible: stream classdesc serialVersionUID = 1, local class serialVersionUID = -7135276226716035594 at hudson.remoting.Channel$2.adapt(Channel.java:943) at hudson.remoting.Channel$2.adapt(Channel.java:938) at hudson.remoting.FutureAdapter.get(FutureAdapter.java:59) at hudson.node_monitors.AbstractAsyncNodeMonitorDescriptor.monitor(AbstractAsyncNodeMonitorDescriptor.java:96) at hudson.node_monitors.AbstractNodeMonitorDescriptor$Record.run(AbstractNodeMonitorDescriptor.java:305) Caused by: java.io.InvalidClassException: hudson.FilePath; local class incompatible: stream classdesc serialVersionUID = 1, local class serialVersionUID = -7135276226716035594 at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:616) at 
java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1843) at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1713) at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2000) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535) at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2245) at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2169) at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2027) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535) at java.io.ObjectInputStream.readObject(ObjectInputStream.java:422) at hudson.remoting.UserRequest.deserialize(UserRequest.java:275) at hudson.remoting.UserRequest.perform(UserRequest.java:186) at hudson.remoting.UserRequest.perform(UserRequest.java:54) at hudson.remoting.Request$2.run(Request.java:360) at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:68) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at hudson.remoting.Engine$1$1.run(Engine.java:98) at java.lang.Thread.run(Thread.java:748) at ......remote call to JNLP4-connect connection from ip-10-89-142-71.us-west-2.compute.internal/10.89.142.71:35556(Native Method) at hudson.remoting.Channel.attachCallSiteStackTrace(Channel.java:1654) at hudson.remoting.UserResponse.retrieve(UserRequest.java:311) at hudson.remoting.Channel$2.adapt(Channel.java:941) ... 
4 more {code} {code} Jan 22, 2018 7:41:37 PM hudson.node_monitors.AbstractAsyncNodeMonitorDescriptor monitor WARNING: Failed to monitor mesos-jenkins-9fe9204c80af4f5c98e67cf012e24a39-mesos for Free Temp Space java.util.concurrent.ExecutionException: java.io.InvalidClassException: hudson.FilePath; local class incompatible: stream classdesc serialVersionUID = 1, local class serialVersionUID = -7135276226716035594 at hudson.remoting.Channel$2.adapt(Channel.java:943) at hudson.remoting.Channel$2.adapt(Channel.java:938) at hudson.remoting.FutureAdapter.get(FutureAdapter.java:59) at hudson.node_monitors.AbstractAsyncNodeMonitorDescriptor.monitor(AbstractAsyncNodeMonitorDescriptor.java:96) at hudson.node_monitors.AbstractNodeMonitorDescriptor$Record.run(AbstractNodeMonitorDescriptor.java:305) Caused by: java.io.InvalidClassException: hudson.FilePath; local class incompatible: stream classdesc serialVersionUID = 1, local class serialVersionUID = -7135276226716035594 at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:616) at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1843) at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1713) at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2000) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535) at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2245) at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2169) at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2027) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535) at java.io.ObjectInputStream.readObject(ObjectInputStream.java:422) at hudson.remoting.UserRequest.deserialize(UserRequest.java:275) at hudson.remoting.UserRequest.perform(UserRequest.java:186) at hudson.remoting.UserRequest.perform(UserRequest.java:54) at hudson.remoting.Request$2.run(Request.java:360) at 
hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:68) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at hudson.remoting.Engine$1$1.run(Engine.java:98) at java.lang.Thread.run(Thread.java:748) at ......remote call to JNLP4-connect connection from ip-10-89-142-71.us-west-2.compute.internal/10.89.142.71:35556(Native Method) at hudson.remoting.Channel.attachCallSiteStackTrace(Channel.java:1654) at hudson.remoting.UserResponse.retrieve(UserRequest.java:311) at hudson.remoting.Channel$2.adapt(Channel.java:941) ... 4 more
Jan 22, 2018 7:41:38 PM org.jenkinsci.plugins.mesos.MesosSlave getRootPath WARNING: IO exception while absolutizing slave root path: java.io.IOException: remote file operation failed: jenkins at hudson.remoting.Channel@1e42b1f:JNLP4-connect connection from ip-xxxxxx.us-west-2.compute.internal/xxxxxx:48042: java.io.IOException: Remote call on JNLP4-connect connection from ip-xxxxxx.us-west-2.compute.internal/1xxxx:48042 failed {code}
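The `InvalidClassException` entries above report two different `serialVersionUID` values for `hudson.FilePath`. Java raises this exception when the sending and receiving JVMs load incompatible versions of a serializable class. One possible reading, which is an assumption and not confirmed in this report, is that the two sides of the channel run different Jenkins or remoting jars. The sketch below uses two hypothetical classes, `Declared` and `Computed`, to show where the two kinds of numbers come from: a declared constant such as `1`, and a hash the JVM derives from the class structure when nothing is declared.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

/** Illustration only; these classes are not part of Jenkins. */
final class SerialVersionSketch {
    /** Declares its version explicitly, so the value is always 1. */
    static class Declared implements Serializable {
        private static final long serialVersionUID = 1L;
        String remote;
    }

    /** Declares nothing, so the JVM computes a hash of the class structure. */
    static class Computed implements Serializable {
        String remote;
    }

    /** Returns the serialVersionUID that serialization would use for the type. */
    static long uidOf(Class<?> type) {
        return ObjectStreamClass.lookup(type).getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println("declared: " + uidOf(Declared.class));
        System.out.println("computed: " + uidOf(Computed.class));
    }
}
```

If one side serializes an object with the declared value and the other side deserializes it with a class that only has the computed value, deserialization fails with the "local class incompatible" message seen in the logs.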
It is probably related to the way the timeout was managed; this changes in the latest snapshot version. I have also found JENKINS-59764 on docker-plugin, which could be related.