Docker-plugin - ssh connection refused when connecting to slave


seol...@gmail.com

Oct 20, 2017, 5:03:45 AM
to Jenkins Users
Hi,
I am getting this error on various versions of Jenkins and the docker-plugin, including the latest of both.
My Jenkins master is running in a container. It successfully launches a docker container for the slave (using the evarga/jenkins-slave image) but then fails to connect with the error below.
I can see my slave running with docker ps and can even successfully connect to it via ssh on 0.0.0.0:32791 (or whatever the given ephemeral port is) using the same credentials configured in the docker agent template section.
Any ideas really welcome as I'm now going round in circles trying to get this working.
Thanks.


[10/20/17 08:13:45] [SSH] Opening SSH connection to 0.0.0.0:32791.
Connection refused (Connection refused)
SSH Connection failed with IOException: "Connection refused (Connection refused)".
java.io.IOException: There was a problem while connecting to 0.0.0.0:32791
	at com.trilead.ssh2.Connection.connect(Connection.java:834)
	at com.trilead.ssh2.Connection.connect(Connection.java:703)
	at com.trilead.ssh2.Connection.connect(Connection.java:617)
	at hudson.plugins.sshslaves.SSHLauncher.openConnection(SSHLauncher.java:1284)
	at hudson.plugins.sshslaves.SSHLauncher$2.call(SSHLauncher.java:804)
	at hudson.plugins.sshslaves.SSHLauncher$2.call(SSHLauncher.java:793)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.ConnectException: Connection refused (Connection refused)
	at java.net.PlainSocketImpl.socketConnect(Native Method)
	at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
	at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:204)
	at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
	at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
	at java.net.Socket.connect(Socket.java:589)
	at com.trilead.ssh2.transport.TransportManager.establishConnection(TransportManager.java:367)
	at com.trilead.ssh2.transport.TransportManager.initialize(TransportManager.java:480)
	at com.trilead.ssh2.Connection.connect(Connection.java:774)
	... 9 more
[10/20/17 08:13:45] Launch failed - cleaning up connection
[10/20/17 08:13:45] [SSH] Connection closed.

seol...@gmail.com

Oct 20, 2017, 5:38:01 AM
to Jenkins Users
Just to add that the Docker plugin also recognises the slave container as running as shown in this attachment.

nicolas de loof

Oct 20, 2017, 6:12:27 AM
to jenkins...@googlegroups.com
To use your own SSH credentials you need to disable the "SSH key management" option (the title is unclear).
With this option set, the container is configured to rely on the Jenkins master's identity SSH key.

I'm working on a UI refactoring to avoid such confusing options

--
You received this message because you are subscribed to the Google Groups "Jenkins Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to jenkinsci-users+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/jenkinsci-users/9359318d-f78f-432a-bd06-4c10768f2cdc%40googlegroups.com.

For more options, visit https://groups.google.com/d/optout.

Martin Heg

Oct 20, 2017, 6:16:56 AM
to jenkins...@googlegroups.com
Thanks for your reply Nicolas. Yes I saw that suggestion in one of your other posts and have already tried it but still seeing the same issue unfortunately.


nicolas de loof

Oct 20, 2017, 8:06:27 AM
to jenkins...@googlegroups.com
Is your Jenkins master running in a Docker container as well?

The stack trace seems to show a failure to open a connection to 0.0.0.0:32791, even before any SSH authentication attempt.
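
One way to confirm that the failure happens at the TCP level, before SSH is involved at all, is a plain socket probe run from inside the master container. The sketch below is only an illustration: the function name `can_connect` and the timeout are assumptions, not part of Jenkins or the docker-plugin.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a plain TCP connection to host:port can be opened."""
    try:
        # create_connection only performs the TCP handshake; no SSH traffic is sent.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts and unreachable hosts.
        return False
```

Calling `can_connect("0.0.0.0", 32791)` from inside the master container and then again from the docker host would show whether the two environments see the same endpoint.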

Martin Heg

Oct 20, 2017, 8:25:27 AM
to jenkins...@googlegroups.com
Yes - jenkins master running in container.
From the docker host I can manually ssh into the slave container with:
ssh -p ephemeral_port jen...@0.0.0.0
If I docker exec into the jenkins master I can also ssh into the slave container using the docker host ip:
ssh -p ephemeral_port jenkins@hostip

hmmmm.....

Martin Heg

Oct 20, 2017, 8:35:09 AM
to jenkins...@googlegroups.com
Frustrating thing is that I definitely had this same exact setup working previously (jenkins master in docker container spawning slaves in sibling containers), albeit in the context of docker toolbox (boot2docker) running on windows, whereas I am now running against standard docker on linux.

Martin Heg

Oct 20, 2017, 8:51:54 AM
to jenkins...@googlegroups.com
Just to clarify once more: the slave container does actually get started (which is why I am able to try manually ssh-ing into it). It stays running for a few minutes and then exits.

nicolas de loof

Oct 20, 2017, 9:11:47 AM
to jenkins...@googlegroups.com
The issue here is that 0.0.0.0 (localhost), as seen from the Jenkins master, is the master's own container, not the docker host, so it can't connect to the ssh slave.
I assume you give the master access to docker.sock to run this sibling container? In that case the master has no way to guess the external IP.
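
To illustrate the point: the 0.0.0.0 in a published-port binding such as 0.0.0.0:32791 is a wildcard meaning "all interfaces of the docker host", not an address a client in another container should dial. A minimal sketch of the substitution follows. The function name `resolve_binding` and the example IP 10.1.2.3 are hypothetical, and this is not the plugin's actual code.

```python
import ipaddress
from typing import Tuple


def resolve_binding(binding: str, docker_host_ip: str) -> Tuple[str, int]:
    """Turn a binding like '0.0.0.0:32791' into an address usable from another container."""
    host, _, port = binding.rpartition(":")
    # 0.0.0.0 is the "unspecified" wildcard address: it says where the port
    # is published on the host, not where a client should connect.
    if ipaddress.ip_address(host).is_unspecified:
        host = docker_host_ip
    return host, int(port)
```

With these assumptions, `resolve_binding("0.0.0.0:32791", "10.1.2.3")` gives `("10.1.2.3", 32791)`, which is the kind of endpoint the master container needs.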

Martin Heg

Oct 20, 2017, 10:16:31 AM
to jenkins...@googlegroups.com
Aha - yes I have 'Docker Host URI' set to 'unix:///var/run/docker.sock', but what I was probably missing was to set the 'Docker Hostname' under the Advanced section in the Cloud settings... once I set that to the ip address of the docker host I am having some more success...
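
A rough way to see why the 'Docker Hostname' setting matters with a unix socket: a unix:// URI carries no hostname at all, so there is nothing to derive an SSH target from. The sketch below models that reasoning only. The function `effective_hostname` and the 0.0.0.0 fallback are assumptions inferred from the log in this thread, not the plugin's real implementation, and tcp://10.1.2.3:2376 is a made-up example.

```python
from typing import Optional
from urllib.parse import urlparse


def effective_hostname(docker_host_uri: str, docker_hostname: Optional[str] = None) -> str:
    """Pick the host to use for SSH connections to spawned containers."""
    # An explicitly configured hostname always wins.
    if docker_hostname:
        return docker_hostname
    # tcp://host:port URIs carry a hostname; unix:// socket paths do not.
    parsed = urlparse(docker_host_uri)
    if parsed.hostname:
        return parsed.hostname
    # Assumed fallback, matching the 0.0.0.0 seen in the first log.
    return "0.0.0.0"
```

Under these assumptions, `effective_hostname("unix:///var/run/docker.sock")` falls back to the wildcard address, while passing the docker host's IP as the second argument returns that IP instead.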

Now slave containers are being spawned and builds are running in them :), but all is not quite right..

Even though the container is created and destroyed after the build the UI is still showing the 'node' as offline. See attached screenshot.

Looking at the log for any of these shows that the docker host ip address is now being used for routing instead of 0.0.0.0 but the log still shows the 'connection refused' error... (I've masked out ip address in the paste below)..

Despite this error, the build history for the node shows that it ran the build successfully..

[10/20/17 14:06:44] [SSH] Opening SSH connection to 10.x.x.x:32842.
Connection refused (Connection refused)
SSH Connection failed with IOException: "Connection refused (Connection refused)".
java.io.IOException: There was a problem while connecting to 10.x.x.x:32842
	at com.trilead.ssh2.Connection.connect(Connection.java:834)
	at com.trilead.ssh2.Connection.connect(Connection.java:703)
	at com.trilead.ssh2.Connection.connect(Connection.java:617)
	at hudson.plugins.sshslaves.SSHLauncher.openConnection(SSHLauncher.java:1284)
	at hudson.plugins.sshslaves.SSHLauncher$2.call(SSHLauncher.java:804)
	at hudson.plugins.sshslaves.SSHLauncher$2.call(SSHLauncher.java:793)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.ConnectException: Connection refused (Connection refused)
	at java.net.PlainSocketImpl.socketConnect(Native Method)
	at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
	at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
	at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
	at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
	at java.net.Socket.connect(Socket.java:589)
	at com.trilead.ssh2.transport.TransportManager.establishConnection(TransportManager.java:367)
	at com.trilead.ssh2.transport.TransportManager.initialize(TransportManager.java:480)
	at com.trilead.ssh2.Connection.connect(Connection.java:774)
	... 9 more
[10/20/17 14:06:44] Launch failed - cleaning up connection
[10/20/17 14:06:44] [SSH] Connection closed.



Thank you so much for your help Nicolas.

screenshot.11.jpg

nicolas de loof

Oct 20, 2017, 10:59:58 AM
to jenkins...@googlegroups.com
Looks like the build termination isn't successfully caught to terminate this node.
I'm trying to stabilize the "launch" mechanism in 1.0.x, then will try to address the more general lifecycle for docker agents, being created as a task enters the queue and destroyed on completion. My experiments on one-shot-executor give me some ideas on how to improve this process without being too hack-ish :D

Martin Heg

Oct 20, 2017, 11:06:00 AM
to jenkins...@googlegroups.com
Sounds good - if it helps with your analysis, I've noticed that the first build after a restart always seems to work fine, with the executor being destroyed correctly. Subsequent builds then suffer from the 'offline'/non-terminated executor.
Thanks.

nicolas de loof

Oct 20, 2017, 11:43:24 AM
to jenkins...@googlegroups.com
Thanks for reporting this. Most probably I've only run the tests for the various launchers once :P

Alejandro Villarreal

Jan 18, 2018, 8:26:03 PM
to Jenkins Users
@Martin / @Nicolas, have you ever seen that after failing to connect to a sibling container (due to a missing "Docker Hostname"), and stopping that container (from the Docker plugin section under Manage Jenkins), a new container never gets created to service new build requests from projects? I have the master set up to not perform builds, so when I request a new build from the project, there are no executors that can service it... a new container should be spawned for this, no? Am I missing something? Could this be a different issue?