Swapping credentials

Dave York

Feb 28, 2020, 1:13:52 PM
to Ansible Project
Hi Ansible Community!

I have a playbook running against Windows servers. I have one play where I'm connecting as the local administrator, then a second play where I'm connecting as a domain user. I'm confused about how to do this. I'm running from Ansible Tower, so I have the domain user applied as the machine credential.

How do I tell the second play to use the domain account (machine credential) after telling the first play to use the local admin account? Any help appreciated, I'm pretty new to Ansible.

- hosts: serverA.internal.domain
  vars:
    ansible_user: Administrator
    ansible_password: XXXXXXXXXXXX
  gather_facts: no
  connection: winrm
  port: 5985

  tasks:
  - debug:
      var: hostvars[inventory_hostname]
      verbosity: 1


- hosts: serverA.internal.domain
  vars:
    ansible_user: ??machine credential??
    ansible_password: XXXXXXXXXXXX
  gather_facts: no
  connection: winrm
  port: 5985


Jordan Borean

Feb 28, 2020, 7:48:20 PM
to Ansible Project
What you have there is one way, but by default WinRM only allows local administrators to connect to the host, so you need to either make sure the domain user is also a local admin or adjust the WinRM security to allow non-admins to connect.

Another option is to define the host twice in your inventory like so

[windows]
serverA_local  ansible_host=serverA.internal.domain ansible_user=administrator ansible_password=pass
serverA_domain  ansible_host=serverA.internal.domain ansible_user=DOMAIN\user ansible_password=pass

[windows:vars]
ansible_connection=winrm
ansible_port=5985

In your play you would set hosts: serverA_local for the local inventory entry and hosts: serverA_domain for the domain inventory entry.
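
As a rough sketch of how the two plays might then look, assuming the inventory entries above (the win_ping tasks are only placeholders for the real work):

```yaml
# First play: connects through the serverA_local entry (local administrator).
- hosts: serverA_local
  gather_facts: no
  tasks:
    - name: Placeholder task run as the local administrator
      win_ping:

# Second play: same machine, reached through the serverA_domain entry (DOMAIN\user).
- hosts: serverA_domain
  gather_facts: no
  tasks:
    - name: Placeholder task run as the domain user
      win_ping:
```

Because the credentials live on the inventory entries, neither play needs its own ansible_user or ansible_password vars.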

Thanks

Jordan

Dave York

Feb 29, 2020, 10:33:38 PM
to Ansible Project
Thanks Jordan, I think you kicked me in the right direction, but I'm still missing something. I'm following your guidance somewhat, but I'm adding the hosts within the playbook instead of in the inventory:

  - name: add new host staging_domain to inventory
    add_host:
      name: staging_domain
      ansible_host: serverA.internal.domain
      ansible_user: '{{ ansible_user }}'
      ansible_password: '{{ ansible_password }}'
      ansible_connection: winrm
      ansible_port: 5985

  - name: add new host staging_localadmin to inventory
    add_host:
      name: staging_localadmin
      ansible_host: serverA.internal.domain
      ansible_user: Administrator
      ansible_password: '{{ randopass }}'
      ansible_connection: winrm
      ansible_port: 5985

The above works when I connect to staging_localadmin, but does NOT when I connect to staging_domain.  

When connecting to staging_domain, I get:

plaintext: the specified credentials were rejected by the server

I'm running this from Tower, so the {{ ansible_user }} and {{ ansible_password }} I'm passing to staging_domain should be the machine credentials. I verified this with some debug statements.

Dave York

Mar 1, 2020, 12:13:20 AM
to Ansible Project
Further troubleshooting makes this seem like it has something to do with time (GPO applying maybe?) 

I can run another job with the same connection to staging_domain and eventually it starts working.

I'm still trying to figure it out; I'll post back here if I find anything.

Dave York

Mar 1, 2020, 12:48:19 AM
to Ansible Project
I can't tell what changes, but while Ansible is trying to connect, the server logs this warning in the event log:

Log Name: System
Event ID: 10111
Level: Warning
Source: Microsoft-Windows-WinRM
Description:

User authentication using Basic Authentication scheme failed.

Unexpected error received from LogonUser 1326: %%1326

Jordan Borean

Mar 1, 2020, 5:47:01 AM
to Ansible Project
Plaintext means Basic auth over HTTP, which Windows rejects because it is not encrypted. Basic auth also does not work for domain accounts, but unfortunately it is the default, for backwards compatibility reasons, when the username specified is not in the UPN format.

If you are connecting with a domain account, you can set ansible_winrm_transport: ntlm to get you going, but I highly recommend you get Kerberos auth working for domain accounts.
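
For illustration only, here is one way the NTLM setting might be attached to the staging_domain entry from the earlier add_host task. The other values are copied from that task, and this is a sketch rather than a tested configuration:

```yaml
- name: add new host staging_domain to inventory
  add_host:
    name: staging_domain
    ansible_host: serverA.internal.domain
    ansible_user: '{{ ansible_user }}'
    ansible_password: '{{ ansible_password }}'
    ansible_connection: winrm
    ansible_port: 5985
    # Use NTLM instead of falling back to Basic auth for this host
    ansible_winrm_transport: ntlm
```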

Dave York

Mar 1, 2020, 5:11:48 PM
to Ansible Project
Acknowledged.  I've been trying to stick with Kerberos now, but STILL having issues..

The machine credentials I use are service...@ALLUPPERCASE.DOMAIN and right after vmware_guest builds the VM, I try to continue on but now I get:

kerberos: the specified credentials were rejected by the server, plaintext: the specified credentials were rejected by the server

However, I still see the same behavior. I get that error, and minutes later I can run the job again and get past it. I'm able to log on to the server with the service account right after vmware_guest finishes.

Pulling my hair out here, not sure what's going on.

Jordan Borean

Mar 1, 2020, 5:31:12 PM
to Ansible Project
The fact that you were able to get a Kerberos ticket shows that your host is set up to get the tickets correctly. Some things you should check:
  • The domain account is a local admin. Non-admins can technically connect through WinRM, but not by default. In any case, Ansible is very limited in what it can do when connecting as a non-admin account, so it's not something we usually document.
  • The time is synced between your Ansible controller and the Windows server.
  • Message encryption is actually in use. This should happen automatically, but some older libraries that Ansible uses may not have it available. To check, set 'ansible_winrm_message_encryption: always' to confirm that message encryption is available and works.

Also, you should set 'ansible_winrm_transport: kerberos' to stop the fallback to Basic auth. Unfortunately this is another backwards compatibility behaviour that we can't take away, even though it isn't really optimal.
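
A minimal sketch of those two settings as host variables. The file name is hypothetical, and the same keys could equally be passed to add_host or set under a group's vars:

```yaml
# host_vars/staging_domain.yml (hypothetical file name)
ansible_connection: winrm
ansible_port: 5985
# Pin the transport so a Kerberos failure is not masked by a Basic auth retry
ansible_winrm_transport: kerberos
# Fail outright if message encryption is unavailable instead of silently skipping it
ansible_winrm_message_encryption: always
```

With the transport pinned, a rejected login should report only the Kerberos error rather than the combined "kerberos: ..., plaintext: ..." message.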

Dave York

Mar 1, 2020, 5:33:30 PM
to Ansible Project
You can actually see kerberos failing within the same play... It will run various commands then just randomly run into one that it gets the kerberos error on.

ansible-krb.png


This is what that play looks like in YAML:
  tasks:
  - name: Ensure SMBv1 is disabled
    win_optional_feature:
      name: smb1protocol
      state: absent

  - name: Initialize Disk 1
    win_shell: Initialize-Disk -Number 1
    ignore_errors: yes

  - name: Wait 15 seconds for disk initialization
    pause:
      seconds: 15

  - name: Partition Disk 1
    win_partition:
      drive_letter: E
      partition_size: -1
      disk_number: 1
      state: present
    ignore_errors: yes  # Ignore errors because this module doesn't handle idempotency well

  - name: Format Disk 1 as E drive
    win_format:
      drive_letter: E
      file_system: NTFS
      new_label: DATA
    ignore_errors: yes  # Ignore errors because this module doesn't handle idempotency well

  - name: Ensure SMBv1 is disabled
    win_optional_feature:
      name: smb1protocol
      state: absent

Dave York

Mar 1, 2020, 5:38:29 PM
to Ansible Project
Thanks again for the help on this.

I double-checked that the machine credential is a domain admin, and verified that time is in sync between the Ansible Tower host and the domain.

I'll try setting ansible_winrm_transport: kerberos and ansible_winrm_message_encryption: always and see what happens

Dave York

Mar 1, 2020, 5:57:18 PM
to Ansible Project

First run looks the same:

ansible-krb2.png

Dave York

Mar 1, 2020, 5:58:06 PM
to Ansible Project

Second Run (from failure) gets further (?!?!)

ansible-krb3.png

Dave York

Mar 1, 2020, 9:28:56 PM
to Ansible Project
I've taken to just brute-force running the same playbook over and over again until the issue goes away.  I still suspect GPO or replication or time... or something

However - one clue - When the kerberos error happens, I see this generated in the log files:

 Log Name:      System
Source:        Microsoft-Windows-WinRM
Date:          3/1/2020 6:16:34 PM
Event ID:      10154
Task Category: None
Level:         Warning
Keywords:      Classic
User:          N/A
Computer:      hostname.internal.domain
Description:
The WinRM service failed to create the following SPNs: WSMAN/hostname.internal.domain; WSMAN/hostname. 

 Additional Data 
 The error received was 1355: %%1355.

 User Action 
 The SPNs can be created by an administrator using setspn.exe utility.

Jordan Borean

Mar 2, 2020, 12:02:41 AM
to Ansible Project
If you have multiple DCs then potentially it could be replication at fault here but usually if a host is missing from the domain controller it queries then a different error is shown (service not found in the database).

Is the host you are connecting to sharing the same hostname as an older host that it's potentially replacing? If so, the SPN could be registered under the newer host on one DC but not yet replicated to another DC, which still thinks the hostname belongs to the older host. Each host technically has its own unique key, so when the server goes to check the credentials it is unable to decrypt the secret, because it is using a different key than the one the DC thought it had (the older host's), and thus thinks the credentials were bad.

Dave York

Mar 2, 2020, 9:11:35 PM
to Ansible Project
I think you got it figured out, Jordan.

I tried with an object that didn't previously exist and it worked.

I've been manually deleting the old computer objects beforehand, but I don't think I've been giving it enough time to replicate (our AD structure is messy/slow right now).

I'll probably work a 'delete computer object' and 'wait 5 minutes' step into my VM provisioning script (the one we've been working with here).
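
Something along these lines might do it. This is an untested sketch that assumes the win_domain_computer module is available and that a domain controller is reachable from the inventory. The computer name and the DC host name are hypothetical:

```yaml
- name: Remove the stale computer object before rebuilding the VM
  win_domain_computer:
    name: serverA                        # hypothetical computer object name
    state: absent
  delegate_to: dc01.internal.domain      # hypothetical domain controller in the inventory

- name: Wait 5 minutes for the deletion to replicate between DCs
  pause:
    minutes: 5
```

The fixed five-minute pause is a guess at our replication delay, so it may need tuning.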

Appreciate the help once again!