Unusual WinRM connection issue


Mike Fennemore

May 27, 2016, 10:26:32 AM
to Ansible Project
I have a select few workgroup Windows Server 2012 R2 servers that give the following error:

<10.128.44.37> ESTABLISH WINRM CONNECTION FOR USER: ansible_user on PORT 5986 TO 10.128.44.37
server_101 | UNREACHABLE! => {
    "changed": false,
    "msg": "ntlm: ('Connection aborted.', error(104, 'Connection reset by peer'))",
    "unreachable": true
}

I am using NTLM with Ansible 2.1.0.0 and pywinrm [kerberos] 2RC4. I have tested that the port is open, recreated the listeners, and run a curl against the server, which delivers a successful 411 response.
Any ideas on further troubleshooting?


Matt Davis

May 27, 2016, 1:48:53 PM
to Ansible Project
Hey Mike,

Unfortunately pywinrm currently has *zero* logging/diagnostic capabilities (something I'd like to correct for troubleshooting stuff like this). Meantime...

A couple of things to try:
- Does it work with Basic auth and a local user on that same box?
- Any chance you could run with Fiddler in the middle? Just run Fiddler on some Windows box, configure it to capture/decrypt HTTPS and to allow external connection, then on your Ansible controller, export HTTPS_PROXY=http://(ip-of-fiddler-box):8888/ and go watch the fun.

I'm mostly just curious where the connection reset is occurring, as there are numerous round-trips involved here (eg, is it NTLM auth failure, resource issue, or something else?).

Thanks,

-Matt

Mike Fennemore

May 30, 2016, 3:45:48 AM
to Ansible Project
For testing locally I'm assuming you mean Test-WSMan -Authentication Basic -Credential <problem account> ? I am currently connecting on 5986 with ignore certificate validation turned on.
So in that case I would add -UseSSL switch on the Test-WSMan. Currently running Test-WSMan -Authentication Basic -Credential <problem account> gives:

Test-WSMAN : <f:WSManFault xmlns:f="http://schemas.microsoft.com/wbem/wsman/1/wsmanfault" Code="2150858974" Machine="Server101"><f:Message>The WinRM client cannot process the request. Unencrypted traffic is currently disabled in the client configuration. Change the client configuration and try the request again. </f:Message></f:WSManFault>
At line:1 char:1

Normally I would say that would mean configuring AllowUnencrypted on the WinRM client; however, the other working systems do not have this configured.

Running Test-WSMAN -Authentication Negotiate -Credential "<user>" -ComputerName localhost returns:

ProductVendor   : Microsoft Corporation
ProductVersion  : OS: 6.3.9600 SP: 0.0 Stack: 3.0

I will try the Fiddler method shortly and return the results.

Mike Fennemore

May 31, 2016, 12:16:09 PM
to Ansible Project
Seems a little odd, but having set HTTPS_PROXY to the Fiddler box, when I run a win_ping against the problem server it does not register any connection in Fiddler.

Matt Davis

Jun 1, 2016, 1:41:50 PM
to Ansible Project
Sorry, by "local user" I just meant using a non-domain user via pywinrm/Ansible. But yeah, for Basic to work, you'd have to (temporarily) enable unencrypted auth with something like:

Set-Item WSMan:\localhost\Service\AllowUnencrypted $true

The HTTPS_PROXY not working seems odd- I use it dozens of times a day... Sure you've got it exported? The problem is almost certainly on the control-machine side, as it'd just hang if the envvar worked and Fiddler wasn't configured properly.
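
One quick way to confirm the variable really is exported is to ask Python what it sees from the same shell that runs Ansible. This is only a sketch using the standard library; the helper name `https_proxy_from_env` is made up for illustration, and it assumes the HTTP stack underneath pywinrm honors the standard proxy environment variables.

```python
import os
import urllib.request


def https_proxy_from_env():
    """Return the HTTPS proxy URL found in the environment, or None.

    Uses urllib's own environment scan, which matches variables ending in
    "_proxy" case-insensitively (so HTTPS_PROXY and https_proxy both count).
    """
    proxies = urllib.request.getproxies_environment()
    return proxies.get("https")


if __name__ == "__main__":
    # Show both the raw variable and what the proxy lookup resolves to.
    print("HTTPS_PROXY (raw):", os.environ.get("HTTPS_PROXY"))
    print("HTTPS proxy in effect:", https_proxy_from_env())
```

If this prints None, the variable was set in a different shell or was never exported, which would explain Fiddler seeing no traffic at all.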

skinnedknuckles

Jun 1, 2016, 6:43:50 PM
to Ansible Project
Actually I had to type 

winrm set winrm/config/service '@{AllowUnencrypted="true"}'

before it would work for me.

gdel...@gmail.com

Jun 4, 2016, 1:11:08 AM
to Ansible Project
You can also try running the PS script below on the hosts to ensure all the WinRM options needed for Ansible to connect have been taken care of.

Mike Fennemore

Jun 6, 2016, 8:20:21 AM
to Ansible Project
I'm beginning to think this might be a result of the problem servers being templated in VMware, perhaps?

J Hawkesworth

Jun 6, 2016, 8:35:39 AM
to Ansible Project
Interesting. 

This change was recently added so you can force the ConfigureRemotingForAnsible.ps1 to generate a new self-signed cert by running like this:

.\ConfigureRemotingForAnsible.ps1 -ForceNewSSLCert true


As it says in the PR: 'This is necessary when a CN name changes and the self-signed cert is no longer valid and winRM is not allowing a connection because of winRM SSL validation errors.'

Hope this helps,

Jon

Mike Fennemore

Jun 6, 2016, 8:48:11 AM
to Ansible Project
Thanks Jon, good to see it's being well maintained. Unfortunately I had already gone down the route of regenerating the self-signed cert via PowerShell.
I ran ConfigureRemotingForAnsible.ps1 just in case I had missed something. Seems like the same issue though:

<xx.xx.xx.xx> ESTABLISH WINRM CONNECTION FOR USER: ansible_user@DOMAIN on PORT 5986 TO xx.xx.xx.xx
Server.domain | UNREACHABLE! => {
    "changed": false,
    "msg": "ntlm: ('Connection aborted.', error(104, 'Connection reset by peer'))",
    "unreachable": true
}


J Hawkesworth

Jun 6, 2016, 7:09:13 PM
to Ansible Project
Anything in the event logs? Since it seems to be a connection reset, I'd hope there might be a message on the Windows machine to say why.

Christoph Wegener

Jun 7, 2016, 10:04:42 AM
to Ansible Project
If you are referring to cloning a Windows machine without proper sysprep usage then that's very well possible. I remember seeing some WinRM blogs where people had problems due to duplicate SIDs ... not 100% sure though.

Mike Fennemore

Jun 7, 2016, 11:32:19 AM
to ansible...@googlegroups.com

Yes, I have seen the articles, but this was a properly sysprepped template. I have recreated the listeners and changed the self-signed cert, and it still yields the same result.

Jon, are there any particular logs I should focus on? The Windows Remote Management and Security logs don't seem to show anything out of the ordinary.

J Hawkesworth

Jun 7, 2016, 12:18:43 PM
to Ansible Project
Sorry, I don't have a specific suggestion where to look.  Sometimes I toss all the event logs and then poke things rather than filter for a specific event category.

One of my colleagues tells me there's an rc6 for pywinrm 0.2 - might be worth trying that if you aren't on it already.

Matt Davis

Jun 7, 2016, 1:03:03 PM
to Ansible Project
Seriously- best thing you could do is figure out why Fiddler isn't working for you and get a trace... Knowing where it's failing in the process can really narrow some things down.

Trond Hindenes

Jun 7, 2016, 5:03:22 PM
to Ansible Project
I would troubleshoot the Windows side first. Are you able to use PowerShell remoting from a Windows node to the "problem" node over 5986 (SSL)?

Mike Fennemore

Jun 24, 2016, 8:48:18 AM
to Ansible Project
09:12:58:4855 fiddler.network.https> HTTPS handshake to 10.128.44.38 (for #2) failed. System.ComponentModel.Win32Exception The client and server cannot communicate, because they do not possess a common algorithm


09:13:34:4067 fiddler.network.https> HTTPS handshake to 10.128.44.38 (for #3) failed. System.ComponentModel.Win32Exception The client and server cannot communicate, because they do not possess a common algorithm


09:17:40:7434 fiddler.network.https> HTTPS handshake to 10.128.44.38 (for #4) failed. System.ComponentModel.Win32Exception The client and server cannot communicate, because they do not possess a common algorithm


09:18:08:8209 fiddler.network.https> HTTPS handshake to 10.128.44.38 (for #5) failed. System.ComponentModel.Win32Exception The client and server cannot communicate, because they do not possess a common algorithm


09:21:23:7477 fiddler.network.https> HTTPS handshake to 10.128.44.38 (for #6) failed. System.ComponentModel.Win32Exception The client and server cannot communicate, because they do not possess a common algorithm


14:38:02:7271 fiddler.network.https> HTTPS handshake to 10.128.44.37 (for #2) failed. System.ComponentModel.Win32Exception The client and server cannot communicate, because they do not possess a common algorithm


I should probably add that, to be FIPS 140-2 compliant, the servers have the following configured:
Protocols: TLS 1.0, TLS 1.1, TLS 1.2
Ciphers Enabled: Triple DES 168, AES 128/128, AES 256/256
Hashes Enabled: SHA, SHA 256, SHA 384, SHA 512
Key Exchanges: PKCS, ECDH

SSL Cipher Suite Order changed:
TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384_P521,TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384_P384,TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384_P256,TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA_P521,TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA_P384,TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA_P256,TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256_P521,TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA_P521,TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256_P384,TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256_P256,TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA_P384,TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA_P256,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384_P521,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384_P384,TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256_P521,TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256_P384,TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256_P256,TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA384_P521,TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA384_P384,TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA_P521,TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA_P384,TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA_P256,TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256_P521,TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256_P384,TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256_P256,TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA_P521,TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA_P384,TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA_P256,TLS_DHE_DSS_WITH_AES_256_CBC_SHA256,TLS_DHE_DSS_WITH_AES_256_CBC_SHA,TLS_DHE_DSS_WITH_AES_128_CBC_SHA256,TLS_DHE_DSS_WITH_AES_128_CBC_SHA,TLS_DHE_DSS_WITH_3DES_EDE_CBC_SHA,TLS_RSA_WITH_AES_256_CBC_SHA256,TLS_RSA_WITH_AES_256_CBC_SHA,TLS_RSA_WITH_AES_128_CBC_SHA256,TLS_RSA_WITH_AES_128_CBC_SHA,TLS_RSA_WITH_3DES_EDE_CBC_SHA
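
For anyone wanting to see what a handshake to the WinRM HTTPS listener actually negotiates from the control machine, here is a minimal sketch using Python's standard `ssl` module. The function name `probe_tls` and its parameters are illustrative, not part of Ansible or pywinrm. Certificate validation is deliberately disabled because the listeners in this thread use self-signed certs. The `server_name` argument sets the name sent during the handshake (SNI), so the same server can be probed by IP alone and by IP plus FQDN; nothing in this thread establishes which part of the handshake was failing, so treat the output as a data point only.

```python
import socket
import ssl


def probe_tls(host, port=5986, server_name=None, timeout=5.0):
    """Attempt a TLS handshake and report the negotiated protocol and cipher.

    Returns a dict with ok=True plus protocol/cipher/bits on success,
    or ok=False plus an error string if the connection or handshake fails.
    """
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    # Diagnostic only: skip certificate checks (self-signed listener certs).
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            with context.wrap_socket(raw, server_hostname=server_name) as tls:
                name, _protocol, bits = tls.cipher()
                return {
                    "ok": True,
                    "protocol": tls.version(),
                    "cipher": name,
                    "bits": bits,
                }
    except OSError as exc:  # ssl.SSLError is a subclass of OSError
        return {"ok": False, "error": f"{type(exc).__name__}: {exc}"}


# Example calls (hostnames are placeholders, not from the thread):
#   probe_tls("10.128.44.37")
#   probe_tls("10.128.44.37", server_name="server101.example.com")
```

Comparing the two results side by side shows whether the handshake behaves differently when a hostname is presented.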

Mike Fennemore

Jun 24, 2016, 9:45:27 AM
to Ansible Project
It might also help to add that all the servers it seems to be failing on are Windows Server 2012 R2 with IIS installed and a few sites with different SSL Certificates installed.

Mike Fennemore

Sep 1, 2016, 11:55:26 AM
to Ansible Project
So, courtesy of a few colleagues, we have a solution. By specifying the FQDN in the inventory rather than the IP, and making sure the Ansible control machine could resolve the FQDN to the IP, the connection is now successful.
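
To check the second half of that fix (that the control machine can resolve the FQDN), something along these lines can be run on the controller. It is a sketch using only the standard library; the helper name `resolve_ipv4` and the example hostname are placeholders.

```python
import socket


def resolve_ipv4(hostname):
    """Return the sorted, de-duplicated IPv4 addresses for hostname.

    Returns an empty list if the name does not resolve.
    """
    try:
        infos = socket.getaddrinfo(
            hostname, None, family=socket.AF_INET, type=socket.SOCK_STREAM
        )
    except socket.gaierror:
        return []
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for AF_INET, sockaddr is (address, port).
    return sorted({info[4][0] for info in infos})


# Example (placeholder name): resolve_ipv4("server101.example.com")
```

An empty list means the inventory FQDN will not work until DNS or the controller's hosts file is sorted out.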