|
My setup is slightly different from the original description. I do not do any substantial adding/deleting of slaves. I have narrowed the cause down to a large number of slaves (30+) being power cycled repeatedly at the same time. I am using Jenkins to perform hardware tests, and one set of tests requires that each slave be rebooted ~200 times. I ran an experiment last night, and the following measures seem to prevent my issue:
1) Switch all slaves to use the Availability setting of "Keep this slave on-line as much as possible, but don't reconnect if temporarily marked offline by the user." This option is added with this plugin: https://github.com/daniel-beck/jenkins-keep-slave-disconnected-plugin
2) Have the script that power-cycles the slaves mark each slave as offline using the Jenkins CLI jar file.
For me, these steps prevent Jenkins from trying to reconnect to each slave whenever it comes online again, which in turn stops the SSH plugin from repeatedly connecting/disconnecting. I'm not sure this will help in your case of adding/deleting slaves, but I figured I'd throw it out there.
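
For anyone wanting to try step 2, here is a rough sketch of what the power-cycling script could do. This is not my exact script. The Jenkins URL, jar path, node name, and the `power_cycle` callback are all placeholders, and bringing the node back with `online-node` once the reboots are finished is my assumption about the cleanup step. `offline-node` and `online-node` are standard Jenkins CLI commands; authentication options are left out and will depend on your setup.

```python
import subprocess

# Hypothetical values: adjust to your own installation.
JENKINS_URL = "http://jenkins.example.com:8080/"
CLI_JAR = "jenkins-cli.jar"


def cli_command(action, node, message=None):
    """Build the argument list for a Jenkins CLI node command."""
    cmd = ["java", "-jar", CLI_JAR, "-s", JENKINS_URL, action, node]
    if message is not None:
        # offline-node accepts -m to record why the node is offline.
        cmd += ["-m", message]
    return cmd


def power_cycle_while_offline(node, power_cycle):
    """Mark the node offline, run the reboot routine, then mark it online.

    power_cycle is a caller-supplied function (hypothetical) that performs
    the actual reboots for the given node name.
    """
    subprocess.run(
        cli_command("offline-node", node, "power cycle test"), check=True
    )
    try:
        power_cycle(node)
    finally:
        # Assumption: the node should be usable again after the test run.
        subprocess.run(cli_command("online-node", node), check=True)
```

With the Availability setting from step 1, a node marked offline this way should be left alone by Jenkins until the script marks it online again.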
|