[JIRA] (JENKINS-38704) Docker JNLP launcher should include -noReconnect option

regs@akom.net (JIRA)

Oct 4, 2016, 11:26:01 AM
to jenkinsc...@googlegroups.com
Alexander Komarov created an issue
 
Jenkins / Improvement JENKINS-38704
Docker JNLP launcher should include -noReconnect option
Issue Type: Improvement
Assignee: magnayn
Components: docker-plugin
Created: 2016/Oct/04 3:25 PM
Environment: Jenkins 2.23
Docker plugin: 0.16.2
Docker: 1.11.1 (Using swarm)
Priority: Minor
Reporter: Alexander Komarov

For reasons I don't understand, my installation produces a number of orphaned Docker containers every day or so (although most of the time everything works). There are several variants; this issue concerns the following one:

+ java -jar /home/jenkins/slave.jar -jnlpUrl http://jenkins.url.hidden:8080//computer/Bigmemory-Swarm-6b70e1946dc5//slave-agent.jnlp -secret XXXXXXX
Oct 03, 2016 4:18:33 PM hudson.remoting.jnlp.Main createEngine
INFO: Setting up slave: Bigmemory-Swarm-6b70e1946dc5
Oct 03, 2016 4:18:33 PM hudson.remoting.jnlp.Main$CuiListener <init>
INFO: Jenkins agent is running in headless mode.
Oct 03, 2016 4:18:33 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Locating server among [http://jenkins.url.hidden:8080/]
Oct 03, 2016 4:18:33 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Handshaking
Oct 03, 2016 4:18:33 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Connecting to jenkins.url.hidden:34232
Oct 03, 2016 4:18:33 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Server reports protocol JNLP3-connect not supported, skipping
Oct 03, 2016 4:18:33 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Trying protocol: JNLP2-connect
Oct 03, 2016 4:18:33 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Connected
Failing to obtain http://jenkins.url.hidden:8080//computer/Bigmemory-Swarm-6b70e1946dc5//slave-agent.jnlp?encrypt=true
java.io.IOException: Failed to load http://jenkins.url.hidden:8080//computer/Bigmemory-Swarm-6b70e1946dc5//slave-agent.jnlp?encrypt=true: 404 Not Found
        at hudson.remoting.Launcher.parseJnlpArguments(Launcher.java:280)
        at hudson.remoting.Launcher.run(Launcher.java:224)
        at hudson.remoting.Launcher.main(Launcher.java:197)
Waiting 10 seconds before retry
Failing to obtain http://jenkins.url.hidden:8080//computer/Bigmemory-Swarm-6b70e1946dc5//slave-agent.jnlp?encrypt=true
java.io.IOException: Failed to load http://jenkins.url.hidden:8080//computer/Bigmemory-Swarm-6b70e1946dc5//slave-agent.jnlp?encrypt=true: 404 Not Found
        at hudson.remoting.Launcher.parseJnlpArguments(Launcher.java:280)
        at hudson.remoting.Launcher.run(Launcher.java:224)
        at hudson.remoting.Launcher.main(Launcher.java:197)
Waiting 10 seconds before retry
Failing to obtain http://jenkins.url.hidden:8080//computer/Bigmemory-Swarm-6b70e1946dc5//slave-agent.jnlp?encrypt=true
java.io.IOException: Failed to load http://jenkins.url.hidden:8080//computer/Bigmemory-Swarm-6b70e1946dc5//slave-agent.jnlp?encrypt=true: 404 Not Found

The retries continue forever, tying up the Docker swarm, and I have to kill and remove the containers manually. I haven't yet been able to capture the corresponding master logs, since this happens only occasionally and it's a busy installation. As a result, I don't know why the master doesn't know about the slave it is launching. Perhaps a job is aborted before the slave is provisioned, or perhaps a job abort at any point can trigger it.

I have the Cloud Connection Timeout and Read Timeout set to 10, and the individual template timeouts are set as well, but of course these do not affect the operation of the JNLP jar. The only relevant option I see in the jar is -noReconnect, which would make it give up on the first 404 in this case.
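
To make the request concrete, here is a minimal sketch of a launch command with the flag appended. It only mirrors the command line visible in the log above; the function name, its parameters, and the default jar path are hypothetical and are not taken from the docker-plugin source.

```python
from typing import List


def build_agent_command(
    jnlp_url: str,
    secret: str,
    no_reconnect: bool = False,
    jar_path: str = "/home/jenkins/slave.jar",  # path as seen in the log above
) -> List[str]:
    """Build the argv for launching the JNLP agent (hypothetical helper)."""
    cmd = ["java", "-jar", jar_path, "-jnlpUrl", jnlp_url, "-secret", secret]
    if no_reconnect:
        # The option requested in this issue: do not keep retrying.
        cmd.append("-noReconnect")
    return cmd
```

Keeping the flag behind a boolean that defaults to off matches the idea of making it optional, so existing setups would keep their current behavior.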

I don't know with complete certainty that this would work for everyone, so perhaps it would be best to make it an option? Without it, I am left having to rig up hacky monitoring and kill-old-containers scripts.
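
For illustration, the detection half of such a monitoring script might look roughly like the sketch below. It assumes the container's output has already been captured as text (for example with `docker logs`), and it matches only the two messages that appear in the log above. The function name and the retry threshold are hypothetical.

```python
RETRY_MARKER = "Waiting 10 seconds before retry"
NOT_FOUND_MARKER = "404 Not Found"


def looks_orphaned(log_text: str, min_retries: int = 3) -> bool:
    """Guess whether an agent container is stuck in the 404 retry loop.

    Returns True when the log contains at least one 404 on the JNLP URL
    and at least `min_retries` retry messages. The threshold is an
    arbitrary assumption, not a value taken from Jenkins.
    """
    lines = log_text.splitlines()
    retries = sum(1 for line in lines if RETRY_MARKER in line)
    saw_404 = any(NOT_FOUND_MARKER in line for line in lines)
    return saw_404 and retries >= min_retries
```

A wrapper would still have to list the containers and remove the ones flagged here, which is exactly the kind of workaround the option would make unnecessary.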

This message was sent by Atlassian JIRA (v7.1.7#71011-sha1:2526d7c)