RHEL6.6 and ControlPersist


Dag Wieers

Oct 15, 2014, 10:57:39 AM
to ansible...@googlegroups.com
Hi,

As some of you may know, Red Hat backported the ControlPersist
functionality to the OpenSSH version that ships with RHEL6.

This is terrific since RHEL users can now use this technique to speed up
Ansible.

However, after some testing it seems to fail for the very first
connection. What happens is that the first connection, when the persistent
connection has not been set up yet, fails. Any subsequent connection seems
to work fine, but obviously this fails to work properly with Ansible.
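
For anyone wanting to reproduce this outside of Ansible, here is a minimal Python sketch that runs the same ssh command twice with connection sharing enabled and reports both exit codes. The option names (ControlMaster, ControlPersist, ControlPath) are standard OpenSSH client options; the host name, the control path and the helper names are only placeholders, and the socket for that path must not exist before the first run.

```python
import subprocess


def build_ssh_argv(host, control_path, remote_cmd="true", persist="60s"):
    """Return an ssh command line that opts in to connection sharing."""
    return [
        "ssh",
        "-o", "ControlMaster=auto",               # create the master if none exists
        "-o", "ControlPersist=%s" % persist,      # keep the master alive afterwards
        "-o", "ControlPath=%s" % control_path,    # where the control socket lives
        "-o", "BatchMode=yes",                    # never prompt for a password
        host,
        remote_cmd,
    ]


def first_two_return_codes(host, control_path):
    """Run the same command twice; the first run has to set up the master."""
    argv = build_ssh_argv(host, control_path)
    return [subprocess.call(argv) for _ in range(2)]


if __name__ == "__main__":
    # "web12" and the socket path are hypothetical values for illustration.
    print(first_two_return_codes("web12", "/tmp/cp-test-%h-%p-%r"))
```

If the first exit code is non-zero and the second is zero, that matches the behaviour described above.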

I think this is a bug, has anyone tested this ?
Or am I doing something wrong here ?

--
Dag

Todd Zullinger

Oct 15, 2014, 5:45:14 PM
to ansible...@googlegroups.com
I noticed this after updating to one of the later test packages from
the bugzilla ticket where this was requested (it did not occur
initially). I also thought it might be something peculiar to my
environment and didn't notice it soon enough to bring it up in the
ticket, unfortunately.

I haven't installed the final update yet (on CentOS), but I was hoping
that it might have corrected the problem. At least I planned to test
the latest package before considering it a package bug.

Knowing it seems to affect more than me, I imagine a new bug report is
in order to resolve the problem.

It is fantastic to have ControlPersist on EL6 though. ;)

--
Todd OpenPGP -> KeyID: 0xBEAF0CE3 | URL: www.pobox.com/~tmz/pgp
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Habit, n. A shackle for the free.
-- Ambrose Bierce, "The Devil's Dictionary"

Michael DeHaan

Oct 15, 2014, 7:07:47 PM
to ansible...@googlegroups.com
"I think this is a bug, has anyone tested this ?"

Sounds like this should be reported with RHEL, definitely.

Please do and post the bugzilla here if you can.

While we could add special code to say "don't try CP on EL6 if ssh says it CAN CP" it seems shipping the feature broken would have been detected by them.
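
To make the "ssh says it CAN CP" part concrete, here is a rough Python sketch of such a capability probe. It assumes that a client without the option complains with "Bad configuration option" on stderr; the function names are made up for this example and this is not a copy of Ansible's own code.

```python
import subprocess


def stderr_indicates_no_controlpersist(stderr_text):
    """True if ssh rejected ControlPersist as an unknown option."""
    return "Bad configuration option" in stderr_text


def ssh_accepts_controlpersist(ssh_binary="ssh"):
    """Ask the local ssh client whether it knows the ControlPersist option."""
    try:
        proc = subprocess.Popen(
            [ssh_binary, "-o", "ControlPersist"],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
        )
    except OSError:
        # No ssh binary found at all.
        return False
    _out, err = proc.communicate()
    return not stderr_indicates_no_controlpersist(err.decode("utf-8", "replace"))
```

A probe like this only shows that the client parses the option, not that the feature behaves correctly, which is why it cannot tell a working ControlPersist from a broken backport.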






--
You received this message because you are subscribed to the Google Groups "Ansible Project" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ansible-project+unsubscribe@googlegroups.com.
To post to this group, send email to ansible-project@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/ansible-project/20141015170811.GX10769%40zaya.teonanacatl.net.

For more options, visit https://groups.google.com/d/optout.

Kevin Fenzi

Oct 15, 2014, 7:49:57 PM
to ansible...@googlegroups.com
On Wed, 15 Oct 2014 16:07:42 -0700
Michael DeHaan <mic...@ansible.com> wrote:

> "I think this is a bug, has anyone tested this ?"
>
> Sounds like this should be reported with RHEL, definitely.
>
> Please do and post the bugzilla here if you can.
>
> While we could add special code to say "don't try CP on EL6 if ssh
> says it CAN CP" it seems shipping the feature broken would have been
> detected by them.

I ran into this as well, but haven't had too much time to isolate it.

However, I did find that controlpersist works with ansible, but if you
enable pipelining, then it hangs on the first connection.

So, perhaps there's an issue around pipelining?

kevin



Todd Zullinger

Oct 15, 2014, 9:02:02 PM
to ansible...@googlegroups.com
With pipelining = True commented out on el6, things still fail for me
on the first run. They do fail with a different error, but perhaps
that's just due to slightly different code paths with pipelining on
and off, I haven't looked at the ansible code at all.

In case it matters, I adjusted the control_path variable to avoid
using ~/.ansible since $HOME is on NFS in my environment.

With pipelining disabled:

$ ansible web -m ping -o
...
web12 | FAILED => failed to resolve remote temporary directory from /var/tmp/ansible-tmp-1413420275.15-6744099385111: `mkdir -p /var/tmp/ansible-tmp-1413420275.15-6744099385111 && chmod a+rx /var/tmp/ansible-tmp-1413420275.15-6744099385111 && echo /var/tmp/ansible-tmp-1413420275.15-6744099385111` returned empty string

With pipelining enabled:

$ ansible web -m ping -o
...
web12 | FAILED >> {"failed": true, "msg": "", "parsed": false}

This is still using openssh-5.3p1-100.el6.x86_64 which was a scratch
build that Petr made per RHBZ #953088. I have not checked whether
there are any differences in the -104 packages included in the latest
RHEL updates (and they haven't made it to my CentOS mirror yet).

This is also with ansible-1.6.10-1.el6 from EPEL. I have not yet
updated to 1.7.x (I see 1.7.2-1 is in epel-testing now, so I'll
probably wait for that to hit stable).

HTH,

--
Todd OpenPGP -> KeyID: 0xBEAF0CE3 | URL: www.pobox.com/~tmz/pgp
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Until you spread your wings, you'll have no idea how far you can walk.
-- Demotivators (www.despair.com)

Kevin Fenzi

Oct 15, 2014, 9:45:03 PM
to ansible...@googlegroups.com
ansible-1.7.2-1.el6.noarch
openssh-5.3p1-104.el6.x86_64

with pipeline disabled:

% ansible -m ping -o junk02\*
junk02.phx2.fedoraproject.org | success >> {"changed": false, "ping": "pong"}

with pipeline enabled:

% ansible -m ping -o junk02\*
junk02.phx2.fedoraproject.org | FAILED >> {"failed": true, "msg": "", "parsed": false}

kevin

Jacob Weber

Oct 24, 2014, 6:46:54 PM
to ansible...@googlegroups.com
Same results here. I have ansible 1.7.2-2 and openssh-5.3p1-104 (installed from the CentOS Continuous Releases Repository), on CentOS 6.5.

Michael DeHaan

Oct 24, 2014, 6:51:51 PM
to ansible...@googlegroups.com
"with pipeline disabled:"

Hmmmmm, curious.

Worst case we could detect RHEL 6 and auto-disable pipelining on that platform, what say ye?


Jacob Weber

Oct 24, 2014, 6:58:41 PM
to ansible...@googlegroups.com
Just to clarify -- I was seeing the same results as Kevin, not Todd. Removing "pipelining = True" from my config fixed the problem.

Since that's the default, I don't know if you really need to change Ansible -- it never worked on RHEL6 before, so nothing's changed in that regard. But I hope they're able to fix this issue in OpenSSH....did someone report it?
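
On the "pipelining = True" setting mentioned above: a small Python 3 sketch for checking whether a given ansible.cfg actually turns it on. It assumes the setting lives in the [ssh_connection] section and treats a missing or commented-out line as off; the helper name is made up for this example.

```python
import configparser


def pipelining_enabled(cfg_text):
    """Return True only if [ssh_connection] explicitly enables pipelining."""
    # RawConfigParser avoids %-interpolation, which values such as
    # control_path (containing %%h and friends) could otherwise trip over.
    parser = configparser.RawConfigParser()
    parser.read_string(cfg_text)
    return parser.getboolean("ssh_connection", "pipelining", fallback=False)


if __name__ == "__main__":
    sample = "[ssh_connection]\n#pipelining = True\n"
    print(pipelining_enabled(sample))
```

Reading the real file is then a matter of passing its contents, for example `pipelining_enabled(open("ansible.cfg").read())`.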

Todd Zullinger

Oct 24, 2014, 7:24:21 PM
to ansible...@googlegroups.com
Jacob Weber wrote:
> Just to clarify -- I was seeing the same results as Kevin, not Todd.
> Removing "pipelining = True" from my config fixed the problem.

I've since updated to 1.7.2 from epel-testing and I see the same
results as you and Kevin. I've disabled pipelining as well for now,
but it's definitely a performance hit (better to be slower and
accurate though).

> But I hope they're able to fix this issue in OpenSSH....did someone
> report it?

I haven't seen anything myself. I don't have a RHEL contract so I was
hoping someone that did might file a ticket so it's more likely to get
attention.

--
Todd OpenPGP -> KeyID: 0xBEAF0CE3 | URL: www.pobox.com/~tmz/pgp
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
If the world didn't suck, we'd all fall off.

Michael DeHaan

Oct 24, 2014, 9:59:25 PM
to ansible...@googlegroups.com
I've filed a GitHub issue for now to include (in 1.8) a check to auto-disable pipelining on RHEL 6.6+ (but not EL7), which should resolve most of the confusion.

We also may make it print a warning if it was on.
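
As a rough idea of what such a check could look like, here is a hypothetical Python sketch. It assumes the check is keyed on the distribution name and version of the control machine; the function names, the distribution labels and the warning text are all invented for illustration and do not come from the Ansible code base.

```python
# Hypothetical distribution labels; real fact names may differ.
EL_FAMILY = {"redhat", "centos", "oraclelinux", "scientific"}


def should_disable_pipelining(distro, version):
    """True for EL 6.6 and later 6.x releases, False for EL7 and everything else."""
    if distro.lower() not in EL_FAMILY:
        return False
    parts = version.split(".")
    try:
        major = int(parts[0])
        minor = int(parts[1]) if len(parts) > 1 else 0
    except ValueError:
        # Unparseable version string: leave the user's setting alone.
        return False
    return major == 6 and minor >= 6


def effective_pipelining(requested, distro, version):
    """Return (pipelining flag, warning message or None)."""
    if requested and should_disable_pipelining(distro, version):
        return False, "pipelining was enabled but is disabled on %s %s" % (distro, version)
    return requested, None
```

The distribution version is only an approximation: elsewhere in this thread the affected openssh-5.3p1-104 package shows up on CentOS 6.5 and OEL 6.5 as well, so a check on the installed openssh package might be the more reliable signal.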

But yeah, bugzilla seems appropriate.

Bugzilla from someone with a nice friendly Red Hat TAM even more so :)




Adam Miller

Oct 29, 2014, 8:16:15 AM
to ansible...@googlegroups.com
On Fri, Oct 24, 2014 at 8:59 PM, Michael DeHaan <mic...@ansible.com> wrote:
> I've filed a GitHub issue for now to include (in 1.8) a check to auto-disable
> pipelining on RHEL 6.6+ (but not EL7), which should resolve most of the
> confusion.
>
> We also may make it print a warning if it was on.
>
> But yeah, bugzilla seems appropriate.
>
> Bugzilla from someone with a nice friendly Red Hat TAM even more so :)
>
>

I've submitted a ticket to my team's TAM through Red Hat support
channels, if/when there is a public bugzilla as a side effect I'll
link it here.

-AdamM


mto...@go2uti.com

Oct 29, 2014, 4:55:16 PM
to ansible...@googlegroups.com, maxam...@fedoraproject.org

    When I saw this discussion thread, I was thrilled because I am using OEL6.5 with OpenSSH-5.3.  Since that is equivalent to RHEL6.5, that meant that there should be an update to OpenSSH for OEL too.  Sure enough, there is (openssh-5.3p1-104.el6.x86_64).  But when I installed it, I could no longer use Ansible to copy files to other servers.  I did not make any other changes (ssh_args, scp_if_ssh, control_path, and pipelining are all still commented out), and my ssh_config does not include any of the "Control*" parameters.  Output from my quick tests is below:

Without "scp_if_ssh":
sinudy36-> ansible sinudm07 -m copy -a "src=testfile dest=/var/tmp"
sinudm07 | FAILED >> {
    "failed": true,
    "md5sum": "d41d8cd98f00b204e9800998ecf8427e",
    "msg": "\u001b]2;pdxmft @ :/home/pdxmft\u0007/usr/bin/python: can't open file '\u001b]2': [Errno 2] No such file or directory\r\n/bin/sh: pdxmft: command not found\r\n",
    "parsed": false
}

With "scp_if_ssh":
sinudy36-> ansible sinudm07 -m copy -a "src=testfile dest=/var/tmp"
sinudm07 | FAILED => failed to transfer file to /home/pdxmft/.ansible/tmp/ansible-tmp-1414615053.51-185971585352740/source:

scp: /home/pdxmft/.ansible/tmp/ansible-tmp-1414615053.51-185971585352740/source: No such file or directory

    This is identical to what I observed before upgrading OpenSSH when I specified ssh instead of paramiko (-c ssh).  I opened another thread about that yesterday ("Cannot copy a file to a server when using ssh") before I saw this thread, but at that time I had to specify "-c ssh" on the command line to trigger the failure, while now it happens regardless of what I do.  Meanwhile, normal scp and sftp from the command line work just fine; I get the failures only when I try to use Ansible to copy files.
    Something about the way that Ansible is calling scp or sftp appears to trigger this bug in the latest version of OpenSSH.  This will be a major problem as it now means I cannot upgrade OpenSSH for any reason and still be able to use Ansible.  For now I have rolled back to openssh-5.3p1-94.el6.  I hope somebody finds a solution soon.
    -Mark

mto...@go2uti.com

Oct 29, 2014, 6:51:17 PM
to ansible...@googlegroups.com


    I deployed a new OEL6.5 server, then upgraded OpenSSH to the new ControlPersist release and installed Ansible onto it.  Without making any configuration changes to anything other than adding server names to the hosts file, I tried to use Ansible to copy a file to another server, and it failed as before.  I downgraded OpenSSH to the previous version and tried the copy again.  It worked perfectly.
    So, there is definitely a mismatch of some sort between Ansible and the newer release of OpenSSH on Linux 6.
    -Mark

Pythagoras Watson

Nov 5, 2014, 4:36:21 PM
to ansible...@googlegroups.com
I ran into similar issues using the new ControlPersist option as well as the ProxyCommand option.  A Red Hat bugzilla was created that has the details.  I think the part in comment 1 starting at "The commands and output below show that ControlPersist=yes does not work as expected." is what you are referring to.  There is a patch for openssh-5.3p1-104.el6.src.rpm attached to the bug that is from me backporting code from the RHEL 7 openssh related to the ControlPersist option.  I don't run Ansible, so I have no way of testing to see if the patch fixes your issue.  However, I would be interested to know if it does.

--
Py



Azul Inho

Nov 11, 2014, 6:35:37 AM
to ansible...@googlegroups.com
just a heads up,

I run RH6.5, not able to upgrade at the moment to 6.6 (and it looks like it wouldn't help either), I have worked around the ControlPersist issue by installing an openssh6 client on my control host box (/opt/openssh6),
I then have a wrapper script that calls ansible-playbook and sets the PATH to collect ssh and friends from /opt/openssh6/bin before /usr/bin.

because it only uses the openssh client (no daemons running), there's no conflict with the normal redhat packages.
It's so much faster
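
For reference, the wrapper idea can be sketched in a few lines of Python. This is not the actual wrapper script, just an illustration of the same approach: put the directory holding the newer ssh client first on PATH, then run ansible-playbook with that environment. The helper names and the playbook name are invented; /opt/openssh6/bin is the path mentioned above.

```python
import os
import subprocess


def env_with_ssh_first(ssh_bin_dir, base_env=None):
    """Copy the environment and put ssh_bin_dir at the front of PATH."""
    env = dict(os.environ if base_env is None else base_env)
    old_path = env.get("PATH", "")
    env["PATH"] = (ssh_bin_dir + os.pathsep + old_path) if old_path else ssh_bin_dir
    return env


def run_playbook(playbook, ssh_bin_dir="/opt/openssh6/bin"):
    """Run ansible-playbook so that it finds ssh, scp and sftp in ssh_bin_dir first."""
    return subprocess.call(
        ["ansible-playbook", playbook],
        env=env_with_ssh_first(ssh_bin_dir),
    )


if __name__ == "__main__":
    # "site.yml" is a placeholder playbook name.
    raise SystemExit(run_playbook("site.yml"))
```

Only the environment of the child process is changed, so the system-wide ssh in /usr/bin is left untouched for everything else.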

Dag Wieers

Nov 13, 2014, 8:41:11 AM
to ansible...@googlegroups.com
On Tue, 11 Nov 2014, Azul Inho wrote:

> just a heads up,
>
> I run RH6.5, not able to upgrade at the moment to 6.6 (and it looks like it
> wouldn't help either), I have worked around the ControlPersist issue by
> installing an openssh6 client on my control host box (/opt/openssh6),
> I then have a wrapper script that calls ansible-playbook and sets the PATH
> to collect ssh and friends from /opt/openssh6/bin before /usr/bin.
>
> because it only uses the openssh client (no daemons running), there's no
> conflict with the normal redhat packages.
> It's so much faster

Let me break the news to you: Red Hat has released an openssh update that
fixes the reported issue with ControlPersist.

* Thu Nov 06 2014 Petr Lautrbach <plau...@redhat.com> 5.3p1-104.1
- Fix ControlPersist option with ProxyCommand (#1160487)

And it works well. Joy !

--
Dag

mto...@go2uti.com

Nov 13, 2014, 2:58:13 PM
to ansible...@googlegroups.com

    I checked the Oracle repository and found openssh-5.3p1-104.el6_6.1.  I installed that and tested.  Nice!!!  It looks like that patch fixed it.
    -Mark

Jacob Weber

Nov 13, 2014, 4:23:17 PM
to ansible...@googlegroups.com
Got it on CentOS too, and turned pipelining on. Can't say that I'm seeing much of a performance difference, but I'm not getting the errors either. I'll do some testing on a longer playbook later.

Jacob Weber

Nov 13, 2014, 7:10:09 PM
to ansible...@googlegroups.com
Yeah, my ~50 minute playbook is down to about 46 minutes now. Not sure why I'm not seeing the difference that others are. I do see the ControlPersist files being created in ~ansible/cp. It's running about 80 plays on each of about 20 hosts. I guess the SSH part of Ansible wasn't adding that much overhead to begin with.