Problems with ssh connections to hosts behind jumphost

Johannes Kastl

Jun 18, 2016, 11:12:04 AM
to ansible...@googlegroups.com
Dear all,

I have some hosts that I can only reach via a jumphost. So, my
.ssh/config contains:

Host foobar
...
ProxyCommand ssh -W %h:%p whatever
...

I have a strange, intermittent issue: I can connect to any one of
these hosts via Ansible and run a playbook, but running it against
more than one host fails, sometimes with "unreachable", sometimes with
module errors. Re-running the playbook on only the failing host works.

Any hints on how to solve this, or how to track down the error? I
thought about checking SSH multiplexing, pipelining and similar stuff,
but without an idea where to look I'm kind of in the dark here...

Any help would be highly appreciated!

Johannes

Matt Martz

Jun 18, 2016, 11:18:10 AM
to ansible...@googlegroups.com
You may want to look into raising the MaxStartups setting in the sshd config on the bastion.

Maybe also look into increasing `timeout` in ansible.cfg.
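
For context on the first suggestion: according to sshd_config(5), MaxStartups takes the form start:rate:full. Once `start` unauthenticated connections are pending, sshd refuses new ones with probability rate/100, and that probability rises linearly to 100% when `full` is reached. With a ProxyCommand jump host, each connection to a target normally opens its own SSH connection to the bastion, so several parallel hosts can add up. A rough Python sketch of that rule follows; the function name is made up for illustration, and 10:30:100 is used only because it is the commonly documented default, so check the bastion's own sshd_config.

```python
def drop_probability(pending: int, start: int, rate: int, full: int) -> float:
    """Approximate chance (0.0-1.0) that sshd refuses a new connection.

    pending: unauthenticated connections currently open on the bastion.
    start, rate, full: the three fields of MaxStartups start:rate:full.
    """
    if pending < start:
        return 0.0
    if pending >= full:
        return 1.0
    # Linear ramp from rate% at `start` up to 100% at `full`.
    return (rate + (100 - rate) * (pending - start) / (full - start)) / 100.0


if __name__ == "__main__":
    for pending in (5, 10, 55, 100):
        print(pending, drop_probability(pending, start=10, rate=30, full=100))
```

If this is the cause, failures should show up only while many connections are still authenticating at once, which would fit an intermittent pattern.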
--
Matt Martz
@sivel
sivel.net

Johannes Kastl

Jun 18, 2016, 11:28:45 AM
to ansible...@googlegroups.com
Thanks Matt,

that was quick!

On 18.06.16 17:18 Matt Martz wrote:
> You may want to look into raising the MaxStartups sshd config on the
> bastion.

I'll try 100:30:300 and see, thanks for the suggestion.

> Maybe also look into increasing `timeout` in ansible.cfg

I already did that, but to no avail...

Johannes

Johannes Kastl

Jun 19, 2016, 9:36:06 AM
to ansible...@googlegroups.com
On 18.06.16 17:18 Matt Martz wrote:
> You may want to look into raising the MaxStartups sshd config on the
> bastion.
>
> Maybe also look into increasing `timeout` in ansible.cfg

Although timeout is set to 30s and MaxStartups is set to 100:30:300, I
still have intermittent failures.

I tried setting 'serial: 2' in the group_vars, but I am not sure
whether it has to be set at the play level. Off to read the docs...
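
As far as the Ansible documentation describes it, `serial` is a play-level keyword rather than a variable, so a value placed in group_vars would not be picked up as a batch size; it belongs next to `hosts:` in the play. What the keyword does can be sketched in Python. This is only an illustration of the batching idea for a positive integer value, not Ansible's actual code, and the host names are hypothetical.

```python
from typing import List


def serial_batches(hosts: List[str], serial: int) -> List[List[str]]:
    """Split hosts into consecutive batches of at most `serial` hosts."""
    if serial < 1:
        raise ValueError("this sketch only handles a positive integer")
    return [hosts[i:i + serial] for i in range(0, len(hosts), serial)]


if __name__ == "__main__":
    hosts = [f"container{n}" for n in range(1, 9)]  # 8 hypothetical hosts
    for batch in serial_batches(hosts, serial=2):
        print(batch)
```

With 8 hosts and a batch size of 2, the play would finish on each pair before moving to the next one, so roughly two connections at a time go through the jump host.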

Johannes

Mirko Friedenhagen

Jun 19, 2016, 2:37:38 PM
to ansible-project

One thing that once hit me: on Mac OS X the open-file ulimit was only 256. I have about 140 hosts, and when our company decided to use a jump host I suddenly ran into problems because the pipelined connections now hit this limit.
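
To check whether this limit is in play on the control machine, the soft limit on open files can be read (and raised, up to the hard limit) with Python's standard `resource` module. A minimal sketch, assuming a Unix-like system; the function name and the target of 1024 are arbitrary example choices.

```python
import resource


def raise_nofile_limit(wanted: int) -> int:
    """Raise the soft open-files limit toward `wanted`; return the new soft limit."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY or soft >= wanted:
        return soft  # already high enough
    # The soft limit may not exceed the hard limit.
    target = wanted if hard == resource.RLIM_INFINITY else min(wanted, hard)
    try:
        resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    except (ValueError, OSError):
        pass  # the OS refused the change; keep the current limit
    return resource.getrlimit(resource.RLIMIT_NOFILE)[0]


if __name__ == "__main__":
    print("before:", resource.getrlimit(resource.RLIMIT_NOFILE))
    print("soft limit now:", raise_nofile_limit(1024))
```

A change made this way applies only to that Python process and its children. For a playbook started from a terminal, the shell builtin `ulimit -n` shows the current value, and `ulimit -n 1024` raises it for that shell session.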

Regards
Mirko

Johannes Kastl

Jun 19, 2016, 3:07:07 PM
to ansible...@googlegroups.com
Hi Mirko,

On 19.06.16 20:37 Mirko Friedenhagen wrote:
> One thing that once hit me: on Mac OS X the open-file ulimit was only 256.
> I have about 140 hosts, and when our company decided to use a jump host I
> suddenly ran into problems because the pipelined connections now hit this
> limit.

Although I am also on OS X, I only have 8 hosts behind the jumphost, so
I would guess this is not the reason for my failures. Also, as far as I
know pipelining is not enabled by default, and I have not switched it
on (yet), at least I think...

Johannes

Bogdan Mihaescu

Jun 23, 2016, 10:32:49 AM
to Ansible Project
Hello,

I have the same problem. Did you find a solution for this?

Thanks

Johannes Kastl

Jun 23, 2016, 2:57:00 PM
to ansible...@googlegroups.com
On 23.06.16 11:57 Bogdan Mihaescu wrote:

> I have the same problem. Did you find a solution for this?

As the hosts are LXC containers running on the jump host (and only
reachable via the jump host), I guess it might be due to memory
usage when the commands are run on all hosts simultaneously.

I'm in the middle of trying out some things, but no, a real solution
has not presented itself yet.

Try toggling (and maybe disabling) SSH multiplexing and pipelining, and
try starting the playbooks with forks and/or serial set...
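
One way to try these settings without editing any config file is to set them per run through environment variables and command-line options. The sketch below only builds the command and environment; the helper name and the playbook name `site.yml` are hypothetical. It assumes the ssh connection plugin, where `ANSIBLE_SSH_ARGS` replaces the default SSH arguments (which normally turn on ControlMaster/ControlPersist) and `ANSIBLE_SSH_PIPELINING` controls pipelining.

```python
import os
from typing import Dict, List, Tuple


def conservative_run(playbook: str, forks: int = 1) -> Tuple[List[str], Dict[str, str]]:
    """Return (argv, env) for a run with multiplexing and pipelining off."""
    env = dict(os.environ)
    # Overrides the default ssh_args, so no control socket is created or reused.
    env["ANSIBLE_SSH_ARGS"] = "-o ControlMaster=no -o ControlPath=none"
    env["ANSIBLE_SSH_PIPELINING"] = "False"
    # --forks limits parallel hosts; -vvvv adds connection debugging output.
    argv = ["ansible-playbook", playbook, "--forks", str(forks), "-vvvv"]
    return argv, env


if __name__ == "__main__":
    argv, env = conservative_run("site.yml")  # hypothetical playbook name
    print("ANSIBLE_SSH_ARGS=" + repr(env["ANSIBLE_SSH_ARGS"]), " ".join(argv))
```

The returned pair can be passed to `subprocess.run(argv, env=env)`. If a run with one fork and no multiplexing is reliable, the settings can be re-enabled one at a time to see which one brings the failures back.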

Johannes
