[JIRA] [swarm-plugin] (JENKINS-31514) Jenkins Swarm slave goes remains offline after master restarts

choonming.goh@kpn.com (JIRA)

Nov 12, 2015, 4:02:01 AM
to jenkinsc...@googlegroups.com
Choon Ming Goh created an issue
 
Jenkins / Bug JENKINS-31514
Jenkins Swarm slave goes remains offline after master restarts
Issue Type: Bug
Assignee: Kohsuke Kawaguchi
Components: swarm-plugin
Created: 12/Nov/15 9:01 AM
Environment: swarm-plugin 2.0
jenkins 1.627
Priority: Minor
Reporter: Choon Ming Goh

I'm currently using the swarm plugin to connect all my slaves to the master. However, whenever the Jenkins service on the master is restarted, the slaves remain offline. A slave only comes back online after I restart its Jenkins Swarm client process.

Nov 11, 2015 10:06:32 AM org.apache.commons.httpclient.HttpMethodBase getResponseBody
WARNING: Going to buffer response body of large or unknown size. Using getResponseBodyAsStream instead is recommended.
Attempting to connect to https://suct2v420.it.mgt:8443/ 98ecac62-d76a-4734-9f9f-9350ee5b4e7d with ID c3c7e53b
Could not obtain CSRF crumb. Response code: 404
javax.net.ssl.SSLHandshakeException: java.security.cert.CertificateException: No name matching suct2v420.it.mgt found
    at sun.security.ssl.Alerts.getSSLException(Alerts.java:192)
    at sun.security.ssl.SSLSocketImpl.fatal(SSLSocketImpl.java:1904)
    at sun.security.ssl.Handshaker.fatalSE(Handshaker.java:279)
    at sun.security.ssl.Handshaker.fatalSE(Handshaker.java:273)
    at sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1446)
    at sun.security.ssl.ClientHandshaker.processMessage(ClientHandshaker.java:209)
    at sun.security.ssl.Handshaker.processLoop(Handshaker.java:901)
    at sun.security.ssl.Handshaker.process_record(Handshaker.java:837)
    at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1023)
    at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1332)
    at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1359)
    at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1343)
    at sun.net.www.protocol.https.HttpsClient.afterConnect(HttpsClient.java:563)
    at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:185)
    at sun.net.www.protocol.https.HttpsURLConnectionImpl.connect(HttpsURLConnectionImpl.java:153)
    at hudson.remoting.Launcher.parseJnlpArguments(Launcher.java:269)
    at hudson.plugins.swarm.SwarmClient.connect(SwarmClient.java:229)
    at hudson.plugins.swarm.Client.run(Client.java:106)
    at hudson.plugins.swarm.Client.main(Client.java:69)
Caused by: java.security.cert.CertificateException: No name matching suct2v420.it.mgt found
    at sun.security.util.HostnameChecker.matchDNS(HostnameChecker.java:208)
    at sun.security.util.HostnameChecker.match(HostnameChecker.java:93)
    at sun.security.ssl.X509TrustManagerImpl.checkIdentity(X509TrustManagerImpl.java:347)
    at sun.security.ssl.AbstractTrustManagerWrapper.checkAdditionalTrust(SSLContextImpl.java:919)
    at sun.security.ssl.AbstractTrustManagerWrapper.checkServerTrusted(SSLContextImpl.java:886)
    at sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1428)
    ... 14 more
Failed to establish JNLP connection to https://suct2v420.it.mgt:8443/
Retrying in 10 seconds

This message was sent by Atlassian JIRA (v6.4.2#64017-sha1:e244265)

grayaii@gmail.com (JIRA)

Nov 14, 2016, 9:15:02 PM
to jenkinsc...@googlegroups.com
Alex Gray commented on Bug JENKINS-31514
 
Re: Jenkins Swarm slave goes remains offline after master restarts

You can use a service like "supervisor" that will automatically start the swarm jar on the slave if it ever goes down. That is what we use. We have the process retry X times with a sleep of Y seconds in between each attempt. That way, we can restart our master for maintenance, and when it is back online the agents will magically re-connect.
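The retry behaviour described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the supervisor configuration mentioned in the comment: the function name run_with_retries and its parameters are hypothetical, and it assumes the launched process exits with a non-zero code when it loses its connection.

```python
import subprocess
import time


def run_with_retries(cmd, max_attempts, delay_seconds,
                     runner=subprocess.call, sleep=time.sleep):
    """Run cmd up to max_attempts times, sleeping between failed attempts.

    Returns the exit code of the last attempt. The runner and sleep
    parameters exist only so the loop can be exercised without starting
    a real process; they are part of this sketch, not of any Jenkins tool.
    """
    exit_code = None
    for attempt in range(1, max_attempts + 1):
        # Blocks until the launched process exits.
        exit_code = runner(cmd)
        if exit_code == 0:
            # Clean shutdown: assume a restart is not wanted.
            break
        if attempt < max_attempts:
            # Give the master time to come back before trying again.
            sleep(delay_seconds)
    return exit_code
```

Calling run_with_retries with the command line that starts the swarm jar, an attempt count of X, and a delay of Y seconds gives roughly the behaviour described; a real process supervisor additionally handles logging, start on boot, and signal forwarding.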


o.v.nenashev@gmail.com (JIRA)

Feb 26, 2018, 3:28:15 AM
to jenkinsc...@googlegroups.com
Oleg Nenashev assigned an issue to Unassigned
 

KK does not maintain this plugin anymore. Moving to Unassigned to set the expectation.

Change By: Oleg Nenashev
Assignee: Kohsuke Kawaguchi -> Unassigned

me@basilcrow.com (JIRA)

May 23, 2019, 11:18:03 AM
to jenkinsc...@googlegroups.com
Basil Crow commented on Bug JENKINS-31514
 
Re: Jenkins Swarm slave goes remains offline after master restarts

This should be working nowadays. You need to use the -deleteExistingClients option so that the Swarm Client can connect after the restart. See PipelineJobTest#buildShellScriptAfterRestart for a working example from a unit test.
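As a rough illustration of where the option goes, the following Python sketch assembles a Swarm Client command line. The helper name build_swarm_command and the jar path are hypothetical. The -master flag is an assumption based on Swarm Client releases from around the time of this issue, so check the client's help output for the flag names your version accepts.

```python
def build_swarm_command(jar_path, master_url, delete_existing_clients=True):
    """Return the argument list for launching the Swarm Client.

    jar_path is a placeholder for wherever the client jar lives.
    The -master flag name is an assumption; -deleteExistingClients is
    the option named in the comment above.
    """
    cmd = ["java", "-jar", jar_path, "-master", master_url]
    if delete_existing_clients:
        # Lets the client replace a stale agent entry left on the master.
        cmd.append("-deleteExistingClients")
    return cmd
```

The resulting list can be passed to subprocess.call or to whatever supervisor starts the client.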


me@basilcrow.com (JIRA)

Jun 1, 2019, 1:46:02 PM
to jenkinsc...@googlegroups.com
Basil Crow closed an issue as Not A Defect
 
Change By: Basil Crow
Status: Open -> Closed
Resolution: Not A Defect