Problems Creating VMs From Template


AJ

Apr 9, 2020, 9:21:35 AM
to Ansible Project
I have created a playbook to spin up three VMs from a vSphere template and add configurations.
The VM creation portion looks like this:
- name: Clone {{ webtier_vm_template }} Template as Web1 & Apply Network Configuration
  vmware_guest:
    hostname: "{{ vcenter_ip }}"
    username: "{{ vcenter_username }}"
    password: "{{ vcenter_password }}"
    validate_certs: False
    name: "{{ web1_hostname }}"
    template: "{{ webtier_vm_template }}"
    datacenter: "{{ datacenter_name }}"
    folder: /AJLAB Infrastructure
    cluster: "{{ cluster_name }}"
    networks:
      - name: "{{ webtier_vmnet }}"
        ip: "{{ web1_ip }}"
        netmask: "{{ webtier_mask }}"
        gateway: "{{ webtier_gateway }}"
        type: static
        start_connected: True
    customization:
      domain: ajlab.local
      hostname: "{{ web1_hostname }}"
      dns_servers:
        - 172.16.92.100
    wait_for_ip_address: yes
    state: poweredon


I've been fighting to get even the network configuration to work properly. With CentOS 7 I first installed open-vm-tools, but the network changes never take place and the Ansible play hangs forever waiting for the network to come up.
I tried adding "Requires=dbus.service" and "After=dbus.service" to the [Unit] section of vmtoolsd.service and got the same behavior. I noticed that if I remove open-vm-tools and install VMware's own VMware Tools, the network changes work properly and everything is perfect.
Except that I'm having problems with the Python 2 dependency in CentOS 7.
The application I'm trying to run on the VMs is written in Python 3, and I can't figure out how to get around the need to use SCL (I realized the scl command creates a subshell, which causes Ansible to wait/hang forever).
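One possible way around the subshell, as a minimal sketch: scl enable accepts the command to run as its final argument, so a task does not have to start an interactive shell at all. The collection name rh-python36 and the script path /opt/app/main.py are assumptions for illustration only; neither comes from the setup described here.

```yaml
# Sketch only: rh-python36 and /opt/app/main.py are hypothetical placeholders.
- name: Run the Python 3 application inside the SCL environment
  command:
    argv:
      - scl
      - enable
      - rh-python36
      # The whole command is one argument, so scl exits when it finishes
      - python3 /opt/app/main.py
```

Passing the command through argv avoids quoting problems and keeps the task from waiting on a shell that never exits.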
So I tried CentOS 8 for the VMs. CentOS 8 comes with open-vm-tools already installed, and I had the same problem getting the network changes to push. I uninstalled open-vm-tools and installed VMware's tools; now the network changes do work, but after the IP changes take place the VM is not reachable (ping or SSH) at the network layer. However, if I console into the VM and ping out from it, the network immediately comes up.
As I'm writing this, I'm thinking a tcpdump on the VM would be interesting, because it kind of feels like ARPs either aren't getting to the VM or the VM is ignoring them; then as soon as the VM pings out, the rest of the network learns the MAC and all is well.
So the Ansible playbook actually continues because the network is fully configured, but when the next task (dnf packages) kicks off, the playbook bombs out because the VM is inaccessible via SSH.
I thought about embedding a script in the template that pings out for a while at first boot, but decided that was too hokey and shouldn't be necessary.
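A less intrusive stopgap, sketched here under assumptions, is to gate the package task on SSH actually answering, so that a failure shows up at a clearly named step. The task is assumed to sit in the same hosts: localhost play as the clone task and reuses its web1_ip variable; the delay and timeout values are arbitrary. It would not fix the underlying reachability problem.

```yaml
# Sketch only: assumes this runs from the control node, right after the clone task.
- name: Wait until SSH on Web1 answers before installing packages
  wait_for:
    host: "{{ web1_ip }}"
    port: 22
    delay: 10        # seconds to wait before the first check (arbitrary)
    timeout: 300     # give up after five minutes (arbitrary)
    state: started
```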

Any ideas or recommendations? Thanks!

David Foley

Apr 9, 2020, 9:30:06 AM
to Ansible Project
I'm using Ansible to deploy Windows machines and having no issues. Compared with your code, I'm not using either of the following:
  •  start_connected: True
  •  wait_for_ip_address: yes
Not sure whether the static IP settings need autologon set in order to be applied.

customization:
            hostname: "{{ VM_Name }}"
            dns_servers:
            - 
            dns_suffix:
            - 
            domain: "{{ domain }}"
            autologon: yes
            password: "{{ local_pass }}"
            runonce:
            - 

But I also see you don't have delegate_to: localhost

David Foley

Apr 9, 2020, 9:32:02 AM
to Ansible Project
If I remember correctly, I had the same issue with the static IP not taking effect because my autologon wasn't working. Once I fixed the password on the template, the static IP settings took effect.

AJ

Apr 9, 2020, 11:28:02 AM
to Ansible Project
Thanks for the replies, I appreciate it. I haven't tried it but I don't think autologon applies since I'm deploying Linux VMs. In the Ansible documentation autologon is under the section "Parameters related to Windows customization:"
For a while I was using delegate_to: localhost but I removed it because after reading the documentation it seemed to me that delegate_to is basically an override for the hosts: parameter and since I'm already running the VM tasks with hosts: localhost I didn't think I needed it (maybe I misunderstood the documentation?).
At any rate, I still had the problem when I was using delegate_to: localhost.
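For reference, a minimal sketch of the layout described here, assuming the provisioning tasks live in their own play. With the play header already targeting the control node, a per-task delegate_to: localhost should be redundant. The play name and the connection and gather_facts settings are illustrative assumptions.

```yaml
# Sketch only: play name, connection and gather_facts are illustrative assumptions.
- name: Provision VMs from the vSphere template
  hosts: localhost
  connection: local      # run modules directly, without SSH to localhost
  gather_facts: false    # facts about the control node are not needed here
  tasks:
    - name: Clone template as Web1
      vmware_guest:
        hostname: "{{ vcenter_ip }}"
        username: "{{ vcenter_username }}"
        password: "{{ vcenter_password }}"
        validate_certs: false
        name: "{{ web1_hostname }}"
        template: "{{ webtier_vm_template }}"
        datacenter: "{{ datacenter_name }}"
        state: poweredon
      # No delegate_to here: the play already targets localhost
```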

The REALLY weird thing is, it's working now. I've run this playbook dozens of times against this template and had the problem. I created a VM from the template, installed tcpdump, and converted it to a new template. I pointed the playbook to it and it worked.
Thinking maybe tcpdump somehow modified the network stack, I pointed my playbook back to the original CentOS 8 template (without tcpdump) and it's still working.

Now I'm confused. We'll see how reliable it is.

AJ

Apr 9, 2020, 11:37:26 AM
to Ansible Project
Nope, it's unreliable. I ran it a few more times: it worked completely a couple of times, then the third time it worked on the first VM and failed on the second VM because the network was unavailable.
So with open-vm-tools it doesn't even change the guest network settings.
With native VMware Tools it changes all the guest network settings, but roughly 75% of the time the IP is unreachable unless I "jump start" it by consoling into the guest and initiating some kind of traffic from the VM.
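The manual jump start could in principle be automated without touching the template. A rough sketch, assuming the vmware_vm_shell module (which runs a command in the guest through VMware Tools rather than over the network) and hypothetical guest_username and guest_password variables holding guest OS credentials:

```yaml
# Sketch only: guest_username and guest_password are hypothetical variables.
- name: Send a few pings from inside Web1 via VMware Tools
  vmware_vm_shell:
    hostname: "{{ vcenter_ip }}"
    username: "{{ vcenter_username }}"
    password: "{{ vcenter_password }}"
    validate_certs: false
    datacenter: "{{ datacenter_name }}"
    vm_id: "{{ web1_hostname }}"
    vm_id_type: vm_name
    vm_username: "{{ guest_username }}"
    vm_password: "{{ guest_password }}"
    vm_shell: /usr/bin/ping
    vm_shell_args: "-c 3 {{ webtier_gateway }}"
    wait_for_process: true   # block until ping exits
```

This only works around the symptom; it does not explain why the VM stays unreachable until it sends traffic.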

David Foley

Apr 9, 2020, 3:18:12 PM
to Ansible Project
Can you test on a different Linux OS, like Ubuntu, and on a different network vSwitch?