win_copy failing (timeout)


Justin Dugan

Aug 31, 2016, 10:21:36 AM
to Ansible Project
I am using this in the playbook:

- name: copy {{eap_dir}}.0.zip
  win_copy: src="{{eap_dir}}.0.zip" dest="c:/temp/{{eap_dir}}.0.zip"


And it's failing with:

TASK [win_JBoss : copy jboss-eap-6.4.0.zip] ************************************
 [WARNING]: FATAL ERROR DURING FILE TRANSFER: Traceback (most recent call last):   File
"/usr/lib/python2.7/site-packages/ansible/plugins/connection/winrm.py", line 204, in _winrm_exec
self._winrm_send_input(self.protocol, self.shell_id, command_id, data, eof=is_last)   File
"/usr/lib/python2.7/site-packages/ansible/plugins/connection/winrm.py", line 185, in
_winrm_send_input     rs = protocol.send_message(xmltodict.unparse(rq))   File "/usr/lib/python2.7
/site-packages/winrm/protocol.py", line 207, in send_message     return
self.transport.send_message(message)   File "/usr/lib/python2.7/site-packages/winrm/transport.py",
line 173, in send_message     response = self.session.send(prepared_request,
timeout=self.read_timeout_sec)   File "/usr/lib/python2.7/site-packages/requests/sessions.py", line
596, in send     r = adapter.send(request, **kwargs)   File "/usr/lib/python2.7/site-
packages/requests/adapters.py", line 499, in send     raise ReadTimeout(e, request=request)
ReadTimeout: HTTPSConnectionPool(host='host', port=5986): Read timed
out. (read timeout=30)

fatal: [jcinstalltest]: FAILED! => {"failed": true, "msg": "winrm send_input failed"}

Is there any way to adjust the timeout? This file is ~200 MB. I also have a patch to copy which is ~400 MB, so 30 seconds is probably too short.

Thanks,

Justin

Justin Dugan

Aug 31, 2016, 10:23:19 AM
to Ansible Project
I have also tried unarchive, which fails with the same error.

J Hawkesworth

Sep 1, 2016, 3:00:40 AM
to Ansible Project
win_copy is unfortunately still not great for large files. In the testing I did earlier in the year, it was slower than fetching a file of the same size via HTTP, and there seems to be a maximum file size, although I haven't hit it myself.

What I do is add an HTTP server (nginx) to my Ansible controllers (there are lots of roles on Galaxy to choose from that do this for you) and then use win_get_url to fetch the files onto the Windows boxes. You could use any web server, but I can vouch for nginx working well. You can also use force=no with win_get_url, which only downloads a file if it has a newer timestamp than the current version; nginx sends the correct headers for this to work, so it's a good combination.
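A minimal sketch of the task this describes, reusing the `eap_dir` variable and destination path from the original playbook. The controller host name and the `/files/` path are assumptions for illustration only; substitute whatever URL your web server actually exposes.

```yaml
# Sketch only: the URL below is hypothetical, not taken from the thread.
- name: fetch {{ eap_dir }}.0.zip from the controller's web server
  win_get_url:
    url: "http://ansible-controller.example.com/files/{{ eap_dir }}.0.zip"
    dest: "c:/temp/{{ eap_dir }}.0.zip"
    # force=no skips the download unless the remote file is newer
    force: no
```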

Happy to go into more detail if you need it.

Hope this helps,

Jon

Peter Rebholz

Oct 13, 2016, 11:10:34 AM
to Ansible Project
I'm also running into this issue and spent some time troubleshooting. In my case, the host I'm pushing the file to is on a separate network without incoming access to where we host the files, so the proposed workaround of using `win_get_url` does not work.

In my troubleshooting, I've found out the following details:

1. The timeout used by `pywinrm` is not relevant because the files are transferred via many small requests. You can play with this setting by defining the vars: `ansible_winrm_read_timeout_sec` and `ansible_winrm_operation_timeout_sec`. While the timeout was reflected in the error message, it had no effect.
2. The temp file created by `win_copy` always tops out at the same size: 110,840 KB.
3. The `winrm` connector uses 250,000 byte chunks to transfer the file. If you change the `buffer_size` parameter in `ansible/plugins/connection/winrm.py` to something larger, then the temp file on the Windows side will be larger than the 110,840 KB mentioned previously.
4. If you bump that `buffer_size` up enough, you can successfully transfer the whole file.
5. By running the following command, I've always received the result "457" when the process fails, regardless of `buffer_size`. This seems to indicate that there is some bound that is being exceeded but I have not been able to figure out if it's a problem with ansible code or the WinRM service configuration on the server.

    ansible-playbook -vvvvv -l windows -i inventory playbook.yml | grep "WINRM PUT" | wc -l
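For reference, the two timeout variables from point 1 can be set as inventory variables. This is a sketch only: the file name and the values are assumptions, and as noted above, raising them changed the error message but not the outcome in this case. pywinrm expects the read timeout to be larger than the operation timeout.

```yaml
# group_vars/windows.yml (hypothetical file name; values are illustrative)
ansible_winrm_operation_timeout_sec: 120
# must be greater than the operation timeout
ansible_winrm_read_timeout_sec: 150
```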

That's as far as I've gone... Hopefully someone more familiar with the WinRM connector may have some ideas...

Peter

Matt Davis

Oct 13, 2016, 1:53:02 PM
to Ansible Project
Every case I've seen of this issue has come down to a problem deep in an SSL/TLS implementation that causes the tunnel to get wedged. I've not dug in far enough with the packet sniffer/TLS debugging to be sure which side is the problem (Windows SChannel or OpenSSL), but on the machines I've seen it on, it's 100% reproducible. There's not really anything we can do about it at the Ansible level, as it's many dependencies away from us (Ansible->pywinrm->requests->urllib3->pyopenssl->OpenSSL). 

The only way I've been able to correct the problem on machines I've seen it on is by recompiling Python against a newer OpenSSL build. Switching up allowed ciphers on the Windows or OpenSSL side generally seems to just move the problem around (i.e., it fails in a different place but still quite predictably). Switching to HTTP instead of HTTPS also makes the problem go away, but, well, don't do that. Hoping to get some of the message-level HTTP encryption stuff going in pywinrm soon (at least for Kerberos, and jborean93 has done it for CredSSP), which could be another way to make this go away in the future.

-Matt

Peter Rebholz

Oct 13, 2016, 6:20:37 PM
to Ansible Project
Thanks for the info, Matt.

I've tried a number of versions of OpenSSL without any luck:

macOS El Capitan (where I originally had this problem) has Python 2.7.10 with OpenSSL 0.9.8zh
FreeBSD 11.0-RELEASE has Python 2.7.12 with OpenSSL 1.0.2j
FreeBSD 11.0-RELEASE with manual build of Python 2.7 (latest) with manual build of OpenSSL 1.1.0b
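One way to confirm which OpenSSL build a given Python is actually linked against is to ask its `ssl` module. The play below is a sketch, and it assumes that the `python` found on the controller's PATH is the same interpreter Ansible runs under, which may not hold on every system.

```yaml
# Sketch only: assumes `python` on PATH is the interpreter Ansible uses.
- hosts: localhost
  connection: local
  gather_facts: no
  tasks:
    - name: report the OpenSSL build linked into the controller's Python
      command: python -c "import ssl; print(ssl.OPENSSL_VERSION)"
      register: openssl_build
      changed_when: false

    - debug:
        var: openssl_build.stdout
```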

Peter

Matt Davis

Oct 13, 2016, 8:01:35 PM
to Ansible Project
Yeah, I never found a packaged Mac python that did the right thing. Recompiling Python against a compiled-by-me latest OpenSSL was the only way I got the issue to go away (I had also tweaked the default cipher list to "best practices" using IISCrypto, but that alone won't fix it with the Apple-supplied Xcode python). 

I added some diagnostic stuff to urllib3 to dump the actual cipher/proto that were negotiated to see if I could narrow it down to specific combos of working/failing between the different versions, but at least from my initial research, it didn't seem to matter (though it wasn't exactly what you'd call scientific or exhaustive).

This one is hairy. I wish there were something more we could do with it, but I'm not sure what that would be.