We've been seeing jobs terminated because of slave disconnects, and would like help solving this issue.
Jenkins 2.140
ssh-slaves-plugin 1.29.4
Host: Ubuntu 16.04.5
Hypervisor, plugin Libvirt 1.8.6
Slave guest: Ubuntu 18.04.1
Jenkins and the VMs are all running on the same machine, so network activity shouldn't be an issue.
I've been looking at the wiki note here:
https://wiki.jenkins.io/display/JENKINS/Remoting+issue
and the anomaly I've noticed (a SocketTimeoutException) is repeated in the slave.log file created by Jenkins:
Feb 06, 2019 8:37:58 AM org.jenkinsci.remoting.engine.WorkDirManager initializeWorkDir
INFO: Using /home/jenkins/remoting as a remoting work directory
Both error and output logs will be printed to /home/jenkins/remoting
<===[JENKINS REMOTING CAPACITY]===>channel started
Remoting version: 3.25
This is a Unix agent
Evacuated stdout
Agent successfully connected and online
ERROR: Connection terminated
java.io.EOFException
at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2681)
at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:3156)
at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:862)
at java.io.ObjectInputStream.<init>(ObjectInputStream.java:358)
at hudson.remoting.ObjectInputStreamEx.<init>(ObjectInputStreamEx.java:49)
at hudson.remoting.Command.readFrom(Command.java:140)
at hudson.remoting.Command.readFrom(Command.java:126)
at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:36)
at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:63)
Caused: java.io.IOException: Unexpected termination of the channel
at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:77)
ERROR: Socket connection to SSH server was lost
java.net.SocketTimeoutException: The connect timeout expired
at com.trilead.ssh2.Connection$1.run(Connection.java:762)
at com.trilead.ssh2.util.TimeoutService$TimeoutThread.run(TimeoutService.java:91)
Slave JVM has not reported exit code before the socket was lost
[02/06/19 08:41:05] [SSH] Connection closed.
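To put a number on how long the connection survived, the first and last timestamps in the slave.log excerpt above can be subtracted. This is only a minimal Python sketch: it assumes the 8:37:58 entry marks the agent launch and the 08:41:05 entry marks the loss of the channel, and the variable and function names are made up for illustration.

```python
from datetime import datetime

# Timestamp format used by the java.util.logging lines in slave.log.
LOG_FORMAT = "%b %d, %Y %I:%M:%S %p"

# Values copied from the slave.log excerpt; treating them as
# "agent launched" and "connection lost" is an assumption.
agent_started = datetime.strptime("Feb 06, 2019 8:37:58 AM", LOG_FORMAT)
connection_lost = datetime.strptime("Feb 06, 2019 8:41:05 AM", LOG_FORMAT)


def seconds_between(start, end):
    """Return the whole number of seconds elapsed from start to end."""
    return int((end - start).total_seconds())


print(seconds_between(agent_started, connection_lost))  # prints 187
```

So the channel stayed up for roughly three minutes (187 seconds) before the EOFException, which may be useful when comparing against any configured timeout values.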
The remoting log on the slave has:
Feb 06, 2019 8:41:05 AM hudson.remoting.SynchronousCommandTransport$ReaderThread run
SEVERE: I/O error in channel channel
java.io.IOException: Unexpected termination of the channel
at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:77)
Caused by: java.io.EOFException
at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2671)
at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:3146)
at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:858)
at java.io.ObjectInputStream.<init>(ObjectInputStream.java:354)
at hudson.remoting.ObjectInputStreamEx.<init>(ObjectInputStreamEx.java:49)
at hudson.remoting.Command.readFrom(Command.java:140)
at hudson.remoting.Command.readFrom(Command.java:126)
at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:36)
at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:63)
and the /var/log/jenkins/jenkins.log has:
Feb 06, 2019 8:41:05 AM hudson.remoting.SynchronousCommandTransport$ReaderThread run
SEVERE: I/O error in channel ubuntu-122-2
java.io.IOException: Unexpected termination of the channel
at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:77)
Caused by: java.io.EOFException
at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2681)
at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:3156)
at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:862)
at java.io.ObjectInputStream.<init>(ObjectInputStream.java:358)
at hudson.remoting.ObjectInputStreamEx.<init>(ObjectInputStreamEx.java:49)
at hudson.remoting.Command.readFrom(Command.java:140)
at hudson.remoting.Command.readFrom(Command.java:126)
at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:36)
at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:63)
Feb 06, 2019 8:41:12 AM hudson.slaves.RetentionStrategy$Demand check
INFO: Disconnecting computer ubuntu-122-1 as it has been idle for 1 min 2 sec
Feb 06, 2019 8:41:12 AM hudson.plugins.libvirt.VirtualMachineSlaveComputer disconnect
INFO: Virtual machine "ubuntu-20170406-122-1" (slave "ubuntu-122-1") is to be shut down.reason: Offline because computer was idle; it will be relaunched when needed. (hudson.slaves.OfflineCause$IdleOfflineCause)
The syslog file on the slave doesn't indicate any anomalies between the startup and termination initiated by Jenkins:
Feb 6 08:39:31 VirtualBox systemd[1]: Starting Clean php session files...
Feb 6 08:39:31 VirtualBox systemd[1]: Started Clean php session files.
Feb 6 08:41:43 VirtualBox systemd[1]: Stopping User Manager for UID 998...
Feb 6 08:41:43 VirtualBox systemd[1294]: Stopped target Default.
Thanks.