AWX jobs stopped working ~10 days ago


Edvinas Kairys

Jul 4, 2023, 10:06:18 AM
to AWX Project
Hello,

I have a question: my AWX playbooks that use the nxos_command and iosxr_command modules started to fail about 10 days ago. I'm using AWX version 20.0.1 with a new custom-built EE. When I switch to an older custom EE, the playbooks start to work again. The errors I'm getting look like this:

The full traceback is:
  File "/runner/requirements_collections/ansible_collections/cisco/iosxr/plugins/module_utils/network/iosxr/iosxr.py", line 122, in get_capabilities
    capabilities = Connection(module._socket_path).get_capabilities()
  File "/usr/local/lib/python3.9/site-packages/ansible/module_utils/connection.py", line 200, in __rpc__
    raise ConnectionError(to_text(msg, errors='surrogate_then_replace'), code=code)
fatal: [HK2ASR01]: FAILED! => {
    "changed": false,
    "failed_when_result": "The conditional check 'get_config.stdout[0] is not search(\\"\\\\/32\\") or get_config.stdout[1] is not search(\\"\\\\/128\\")' failed. The error was: error while evaluating conditional (get_config.stdout[0] is not search(\\"\\\\/32\\") or get_config.stdout[1] is not search(\\"\\\\/128\\")): 'dict object' has no attribute 'stdout'. 'dict object' has no attribute 'stdout'",
    "invocation": {
        "module_args": {
            "commands": [
                "show object-group network ipv4 External_Monitoring | include /32",
                "show object-group network ipv6 External_Monitoring_IPv6 | include /128"
            ],
            "interval": 1,
            "match": "all",
            "retries": 10,
            "wait_for": null
        }
    },
    "msg": "command timeout triggered, timeout value is 30 secs.\\nSee the timeout setting options in the Network Debug and Troubleshooting Guide."


OR

ok: [TEONET01A] => {"changed": false, "failed_when_result": false, "msg": "command timeout triggered, timeout value is 30 secs.\\nSee the timeout setting options in the Network Debug and Troubleshooting Guide."}

It seems to be something related to Python: the older EE uses Python 3.8 and the newer one uses Python 3.9. Could that be the reason? Isn't the EE environment independent of the rest of AWX?

AWX Project

Jul 5, 2023, 2:59:46 PM
to AWX Project
Hi,

In your newer custom EE, did you change the collection version or the Ansible version? Are you able to provide the output of `ansible-galaxy collection list` and `ansible --version` from inside the EE?

Thanks,

AWX Team
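
One way to collect this information is a small playbook run as an AWX job that uses the EE in question. The following is a minimal sketch and is not taken from the thread: the play name, the task names, and the registered variable names (`core_version`, `collection_list`) are assumptions chosen for illustration, and it assumes `localhost` is usable as a target in the job's inventory.

```yaml
---
- name: Report Ansible and collection versions from inside the EE
  hosts: localhost
  connection: local
  gather_facts: false
  tasks:
    # Read-only commands, so never report them as "changed".
    - name: Get the ansible-core version
      ansible.builtin.command: ansible --version
      register: core_version
      changed_when: false

    - name: List the installed collections
      ansible.builtin.command: ansible-galaxy collection list
      register: collection_list
      changed_when: false

    - name: Show both outputs
      ansible.builtin.debug:
        msg:
          - "{{ core_version.stdout_lines }}"
          - "{{ collection_list.stdout_lines }}"
```

Running the same job template once with each EE gives two outputs that can be compared side by side.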

Edvinas Kairys

Jul 6, 2023, 8:42:40 AM
to awx-p...@googlegroups.com
Hello,

Thanks for the reply.

When building the image with ansible-builder, I just added this collection:
- name: cisco.iosxr

Here is the information you requested:

Newer EE, where the playbook fails:

        "stdout": "ansible [core 2.15.0]\n  config file = None\n  configured module search path = ['/runner/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']\n  ansible python module location = /usr/local/lib/python3.9/site-packages/ansible\n  ansible collection location = /runner/requirements_collections:/runner/.ansible/collections:/usr/share/ansible/collections\n  executable location = /usr/local/bin/ansible\n  python version = 3.9.16 (main, Dec  8 2022, 00:00:00) [GCC 11.3.1 20221121 (Red Hat 11.3.1-4)] (/usr/bin/python3)\n  jinja version = 3.1.2\n  libyaml = True",
        "stdout_lines": [
            "ansible [core 2.15.0]",
            "  config file = None",
            "  configured module search path = ['/runner/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']",
            "  ansible python module location = /usr/local/lib/python3.9/site-packages/ansible",
            "  ansible collection location = /runner/requirements_collections:/runner/.ansible/collections:/usr/share/ansible/collections",
            "  executable location = /usr/local/bin/ansible",
            "  python version = 3.9.16 (main, Dec  8 2022, 00:00:00) [GCC 11.3.1 20221121 (Red Hat 11.3.1-4)] (/usr/bin/python3)",
            "  jinja version = 3.1.2",
            "  libyaml = True"

        "stdout_lines": [
            "",
            "# /runner/requirements_collections/ansible_collections",
            "Collection              Version",
            "----------------------- -------",
            "ansible.netcommon       5.1.2  ",
            "ansible.utils           2.10.3 ",
            "cisco.iosxr             6.0.0  ",
            "community.general       7.1.0  ",
            "community.mysql         3.7.2  ",
            "community.proxysql      1.5.1  ",
            "",
            "# /usr/share/ansible/collections/ansible_collections",
            "Collection              Version",
            "----------------------- -------",
            "amazon.aws              6.0.1  ",
            "ansible.netcommon       5.1.1  ",
            "ansible.posix           1.5.4  ",
            "ansible.utils           2.10.3 ",
            "ansible.windows         1.14.0 ",
            "awx.awx                 22.3.0 ",
            "azure.azcollection      1.15.0 ",
            "cisco.iosxr             5.0.2  ",      
            "cisco.nxos              4.3.0  ",     
            "community.vmware        3.6.0  ",
            "google.cloud            1.1.3  ",
            "kubernetes.core         2.4.0  ",
            "netbox.netbox           3.13.0 ",
            "openstack.cloud         2.1.0  ",
            "ovirt.ovirt             3.1.2  ",
            "redhatinsights.insights 1.0.7  ",
            "theforeman.foreman      3.10.0 "
        ]
    }
}


Old EE where it works OK:

      "ansible [core 2.12.5.post0]",
      "  config file = None",
      "  configured module search path = ['/home/runner/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']",
      "  ansible python module location = /usr/local/lib/python3.8/site-packages/ansible",
      "  ansible collection location = /runner/requirements_collections:/home/runner/.ansible/collections:/usr/share/ansible/collections",
      "  executable location = /usr/local/bin/ansible",
      "  python version = 3.8.13 (default, Jun 24 2022, 15:27:57) [GCC 8.5.0 20210514 (Red Hat 8.5.0-13)]",
      "  jinja version = 2.11.3",
      "  libyaml = True"
    ],


 "stdout_lines": [
      "",
      "# /runner/requirements_collections/ansible_collections",
      "Collection         Version",
      "------------------ -------",
      "ansible.netcommon  5.1.2  ",
      "ansible.utils      2.10.3 ",
      "cisco.iosxr        6.0.0  ",
      "community.general  7.1.0  ",
      "community.mysql    3.7.2  ",
      "community.proxysql 1.5.1  ",
      "",
      "# /usr/share/ansible/collections/ansible_collections",
      "Collection              Version",
      "----------------------- -------",
      "amazon.aws              4.1.0  ",
      "ansible.posix           1.4.0  ",
      "ansible.utils           2.6.1  ",
      "ansible.windows         1.11.0 ",
      "awx.awx                 21.4.0 ",
      "azure.azcollection      1.13.0 ",
      "community.general       5.4.0  ",
      "community.vmware        2.7.0  ",
      "google.cloud            1.0.2  ",
      "kubernetes.core         2.3.2  ",
      "openstack.cloud         1.8.0  ",
      "ovirt.ovirt             2.2.2  ",
      "redhatinsights.insights 1.0.7  ",
      "theforeman.foreman      3.4.0  "


It's strange that on the EE which is not working I see a double entry for the IOS-XR collection. Maybe that could be the reason it's not working? I can't imagine how it got there, or why it was working before. Also, I'm not sure this is the problem, because another playbook that uses the cisco.nxos collection also started to fail even though that entry is not doubled.
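
For reference, the listing for the newer EE shows cisco.iosxr in two locations: 6.0.0 under /runner/requirements_collections and 5.0.2 under /usr/share/ansible/collections. The traceback loads iosxr.py from /runner/requirements_collections, so the 6.0.0 copy is the one in use there. Below is a hedged sketch of how versions could be pinned so that drift between builds is easier to rule out. It assumes the project-level collections come from a `collections/requirements.yml` file in the project repository, which the thread does not show. The version numbers are copied from the listing above and are not a recommendation.

```yaml
---
# collections/requirements.yml (assumed location, not shown in the thread)
collections:
  # Versions below are the ones reported under
  # /runner/requirements_collections in the listings above.
  - name: ansible.netcommon
    version: "5.1.2"
  - name: ansible.utils
    version: "2.10.3"
  - name: cisco.iosxr
    version: "6.0.0"
```

The same `name` and `version` keys are accepted in the `galaxy` section of an ansible-builder definition file, so the copy baked into the EE can be pinned the same way.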

P.S. Is there an easier way to get into the EE, for example from the pod side, and run these commands (ansible --version) instead of adding them to playbooks and running them via AWX?

Thank you

 


AWX Project

Jul 7, 2023, 1:35:58 PM
to AWX Project
Hello,
Thank you for providing this additional information! Based on what you have provided we'd like to gather a bit more information from you. Are you building this with Ansible Builder? If so, could you please provide us with the Ansible execution environment yaml file? 

Thank you for your time!

- AWX Team

Edvinas Kairys

Jul 7, 2023, 2:08:44 PM
to awx-p...@googlegroups.com
Hello,

Today I tried to build a fresh image. I used git clone https://github.com/ansible/awx-ee, and my edited execution-environment.yml file looks like this (I just added some collections and Python modules):

---
version: 3
images:
  base_image:
    name: quay.io/centos/centos:stream9
dependencies:
  ansible_core:
    # Require minimum of 2.15 to get ansible-inventory --limit option
    package_pip: ansible-core>=2.15.0rc2,<2.16
  ansible_runner:
    package_pip: ansible-runner
  galaxy: |
    ---
    collections:
      - name: awx.awx
      - name: azure.azcollection
      - name: amazon.aws
      - name: theforeman.foreman
      - name: google.cloud
      - name: openstack.cloud
      - name: community.vmware
      - name: ovirt.ovirt
      - name: kubernetes.core
      - name: ansible.posix
      - name: ansible.windows
      - name: redhatinsights.insights
      - name: ansible.netcommon
      - name: ansible.utils
      - name: cisco.nxos
      - name: netbox.netbox
      - name: cisco.iosxr

  system: |
    git-core [platform:rpm]
    python3.9-devel [platform:rpm compile]
    libcurl-devel [platform:rpm compile]
    krb5-devel [platform:rpm compile]
    krb5-workstation [platform:rpm]
    subversion [platform:rpm]
    subversion [platform:dpkg]
    git-lfs [platform:rpm]
    sshpass [platform:rpm]
    rsync [platform:rpm]
    epel-release [platform:rpm]
    python-unversioned-command [platform:rpm]
    unzip [platform:rpm]
  python: |
    git+https://github.com/ansible/ansible-sign
    ncclient
    paramiko
    pykerberos
    pyOpenSSL
    pypsrp[kerberos,credssp]
    pywinrm[kerberos,credssp]
    toml
    pexpect>=4.5
    python-daemon
    pyyaml
    six
    netaddr
    genie
    netbox
    pyats
    pynetbox

additional_build_steps:
  append_base:
    - RUN $PYCMD -m pip install -U pip
  append_final:
    - COPY --from=quay.io/ansible/receptor:devel /usr/bin/receptor /usr/bin/receptor
    - RUN mkdir -p /var/run/receptor
    - RUN git lfs install --system


The errors I'm getting when running the playbook:

}
redirecting (type: connection) ansible.builtin.network_cli to ansible.netcommon.network_cli
Loading collection ansible.netcommon from /runner/requirements_collections/ansible_collections/ansible/netcommon
redirecting (type: connection) ansible.builtin.network_cli to ansible.netcommon.network_cli
Loading collection ansible.netcommon from /runner/requirements_collections/ansible_collections/ansible/netcommon

The full traceback is:
  File "/runner/requirements_collections/ansible_collections/cisco/iosxr/plugins/module_utils/network/iosxr/iosxr.py", line 122, in get_capabilities
    capabilities = Connection(module._socket_path).get_capabilities()
  File "/usr/local/lib/python3.9/site-packages/ansible/module_utils/connection.py", line 200, in __rpc__
    raise ConnectionError(to_text(msg, errors='surrogate_then_replace'), code=code)
fatal: [HK2ASR02]: FAILED! => {
    "changed": false,

    "invocation": {
        "module_args": {
            "commands": [
                "show object-group network ipv4 External_Monitoring | include /32",
                "show object-group network ipv6 External_Monitoring_IPv6 | include /128"
            ],
            "interval": 1,
            "match": "all",
            "retries": 10,
            "wait_for": null
        }
    },
    "msg": "command timeout triggered, timeout value is 30 secs.\\nSee the timeout setting options in the Network Debug and Troubleshooting Guide."
}


This is what the task looks like:

     - name: GET ipv4 and ipv6 object-groups from devices
       iosxr_command:
         commands:
             - "show object-group network ipv4 External_Monitoring | include /32"
             - "show object-group network ipv6 External_Monitoring_IPv6 | include /128"
       register: get_config


The only difference I see from the older EE is that the older one uses Python 3.8 and the new one uses Python 3.9. Maybe there is some kind of problem with the modules when using Python 3.9?

Thank you.

