Project Sync Failures

callm...@gmail.com

Sep 26, 2023, 12:12:22 PM
to AWX Project
Hello all,

We did a big upgrade of AWX from 15.0.1 to 23.1.0.
To do so I dumped the DB, created another postgres cluster, database, and user, granting access so the user can access the database.
Afterwards, I used the AWX Operator in OpenShift 4 to create an AWX instance.

After a lot of back and forth I was able to get AWX up and running, and I can log into it just as I can in my old environment (using creds stored in AD).

Sending curl requests to add a host to an inventory works, and moving through the UI seems to work, but my projects aren't syncing reliably.

I've seen cases where a sync succeeds, but the majority of sync attempts fail.
The logs aren't much use. Here's the entirety of the logs:

Enter passphrase for /var/tmp/awx_89945_1pdcl45c/artifacts/89945/ssh_key_data:
Identity added: /var/tmp/awx_89945_1pdcl45c/artifacts/89945/ssh_key_data (awx@awxpoc-web-8f97cb7fc-khxrp)

PLAY [Update source tree if necessary] *****************************************

TASK [Update project using git] ************************************************

The logs from the task pod:

2023-09-26 16:06:24,921 INFO     [11066f637064479eb13a7e75a49b5e86] awx.analytics.job_lifecycle projectupdate-89949 controller node chosen

2023-09-26 16:06:24,921 INFO     [11066f637064479eb13a7e75a49b5e86] awx.analytics.job_lifecycle projectupdate-89949 execution node chosen

2023-09-26 16:06:25,066 INFO     [11066f637064479eb13a7e75a49b5e86] awx.analytics.job_lifecycle projectupdate-89949 waiting

2023-09-26 16:06:25,421 INFO     [11066f637064479eb13a7e75a49b5e86] awx.analytics.job_lifecycle projectupdate-89949 pre run

2023-09-26 16:06:25,440 INFO     [11066f637064479eb13a7e75a49b5e86] awx.analytics.job_lifecycle projectupdate-89949 preparing playbook

2023-09-26 16:06:25,512 INFO     [11066f637064479eb13a7e75a49b5e86] awx.analytics.job_lifecycle projectupdate-89949 running playbook

2023-09-26 16:06:25,537 INFO     [11066f637064479eb13a7e75a49b5e86] awx.analytics.job_lifecycle projectupdate-89949 work unit id received

2023-09-26 16:06:25,577 INFO     [11066f637064479eb13a7e75a49b5e86] awx.analytics.job_lifecycle projectupdate-89949 work unit id assigned

2023-09-26 16:06:25,812 INFO     [-] awx.main.wsrelay Producer 10.129.6.253-schedules-changed has no subscribers, shutting down.

2023-09-26 16:06:31,119 INFO     [11066f637064479eb13a7e75a49b5e86] awx.main.commands.run_callback_receiver Starting EOF event processing for Job 89949

2023-09-26 16:06:31,123 INFO     [11066f637064479eb13a7e75a49b5e86] awx.analytics.job_lifecycle projectupdate-89949 post run

2023-09-26 16:06:31,417 INFO     [11066f637064479eb13a7e75a49b5e86] awx.analytics.job_lifecycle projectupdate-89949 finalize run

2023-09-26 16:06:31,423 WARNING  [11066f637064479eb13a7e75a49b5e86] awx.main.dispatch project_update 89949 (failed) encountered an error (rc=None), please see task stdout for details.
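The job_lifecycle lines above can be turned into a small timeline to see how long the update actually ran before it was marked failed. This is a minimal Python sketch, assuming the log format shown in this post; `parse_lifecycle` and `seconds_between` are names made up for illustration and are not part of AWX.

```python
import re
from datetime import datetime

# Matches job_lifecycle lines in the format pasted above, for example:
# 2023-09-26 16:06:25,512 INFO     [1106...] awx.analytics.job_lifecycle projectupdate-89949 running playbook
LIFECYCLE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+\w+\s+\[[^\]]*\]\s+"
    r"awx\.analytics\.job_lifecycle\s+(?P<job>\S+)\s+(?P<state>.+?)\s*$"
)


def parse_lifecycle(lines):
    """Return (timestamp, job, state) tuples; non-lifecycle lines are skipped."""
    events = []
    for line in lines:
        match = LIFECYCLE_RE.match(line.strip())
        if match:
            ts = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S,%f")
            events.append((ts, match.group("job"), match.group("state")))
    return events


def seconds_between(events, job, start_state, end_state):
    """Seconds from the first start_state to the first end_state of one job."""
    first_seen = {}
    for ts, event_job, state in events:
        if event_job == job and state not in first_seen:
            first_seen[state] = ts
    return (first_seen[end_state] - first_seen[start_state]).total_seconds()


# Three lines copied from the task pod log above (the wsrelay line is ignored).
SAMPLE = [
    "2023-09-26 16:06:25,512 INFO     [11066f637064479eb13a7e75a49b5e86] awx.analytics.job_lifecycle projectupdate-89949 running playbook",
    "2023-09-26 16:06:25,812 INFO     [-] awx.main.wsrelay Producer 10.129.6.253-schedules-changed has no subscribers, shutting down.",
    "2023-09-26 16:06:31,123 INFO     [11066f637064479eb13a7e75a49b5e86] awx.analytics.job_lifecycle projectupdate-89949 post run",
]

events = parse_lifecycle(SAMPLE)
elapsed = seconds_between(events, "projectupdate-89949", "running playbook", "post run")
print(f"running playbook -> post run: {elapsed:.3f} s")
```

For the log above this gives roughly 5.6 seconds between "running playbook" and "post run", so the update stopped early in the git task rather than hanging until a timeout.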

This is what I see on the web pod:

2023-09-26 16:08:02,219 INFO     [afd1ae5ddf624e02a06e123b25ad9b19] awx.analytics.job_lifecycle projectupdate-89950 created

10.130.5.200 - - [26/Sep/2023:16:08:02 +0000] "POST /api/v2/projects/527/update/ HTTP/1.1" 202 2265 "https://awx.apps.ocpazt001.csx.com/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/116.0.0.0 Safari/537.36" "10.92.172.184"

[pid: 244|app: 0|req: 27/3147] 10.130.5.200 () {70 vars in 2450 bytes} [Tue Sep 26 16:08:01 2023] POST /api/v2/projects/527/update/ => generated 2265 bytes in 456 msecs (HTTP/1.1 202) 15 headers in 635 bytes (1 switches on core 0)

If I rsh into the task or web pod, I can run a git clone and it is successful.

I've tried creating a project with new creds; however, the issue appears to be the same.

Thanks,

Shawn

callm...@gmail.com

Sep 26, 2023, 12:33:16 PM
to AWX Project
Using the Developer Tools from Chrome, I ran a sync and captured the following:

General Headers:
Request Method: POST
Status Code: 202 Accepted
Remote Address:
Referrer Policy: strict-origin-when-cross-origin
Request Headers:
Accept: application/json, text/plain, */*
Accept-Encoding: gzip, deflate, br
Accept-Language: en-US,en;q=0.9
Connection: keep-alive
Content-Length: 0
Cookie: CFCLIENT_TCIS=issch%3D0%23issm%3D0%23security%5Faccess%3D3%2E1%2C1%2E1%2C12%2E2%2C14%2E1%2C4%2E1%2C5%2E2%2C6%2E1%2C8%2E1%2C10%2E1%2C13%2E1%2C7%2E1%2C9%2E1%23racf%3DJ8683%23company%3D2139%23username%3DSingh%2C%20Radesh%23ruserid%3D94164%23sectaccess%3D1%2C0%2C1%2C1%2C2%2C1%2C1%2C1%2C1%2C1%2C0%2C2%2C1%2C1%2C%23jobtype%3D9%23archive%3Dfalse%23ishd%3D0%23securitysections%3D%23notify%5Frefresh%5Frate%3D300%23orglevel%3D800%23assigngroup%3D4332%2C31232%2C32192%2C19828%2C22569%2C4334%2C32172%2C35652%2C4471%23department%3D2311%23team%3D1441%23eivr%3D%23position%3DJ868301%23; CFGLOBALS=urltoken%3DCFID%23%3D3180505%26CFTOKEN%23%3D47387534%23lastvisit%3D%7Bts%20%272022%2D06%2D21%2012%3A21%3A19%27%7D%23hitcount%3D47%23timecreated%3D%7Bts%20%272022%2D05%2D31%2010%3A58%3A52%27%7D%23cftoken%3D47387534%23cfid%3D3180505%23; _ga=GA1.1.1849331890.1666123379; com.silverpop.iMAWebCookie=e28825bd-04a1-e128-6d28-ec6498017779; _ga_BL8HZZJ5X4=GS1.1.1680100901.2.0.1680100901.0.0.0; _fbp=fb.1.1693919665758.1258355006; _ga_58T88XBVN1=GS1.1.1693933404.10.1.1693933404.60.0.0; 2d18f267facdf29e764fe65056416803=ecddd3bdebb6688bb8aaeffa00169634; userLoggedIn=true; awx_sessionid=1ppd0stpfenm8o1y35s2xb5uqxe5a2ts; csrftoken=NKdsvTELFbhMSumG0aDfJLEw3im0HCDj
Sec-Ch-Ua: "Chromium";v="116", "Not)A;Brand";v="24", "Google Chrome";v="116"
Sec-Ch-Ua-Mobile: ?0
Sec-Ch-Ua-Platform: "macOS"
Sec-Fetch-Dest: empty
Sec-Fetch-Mode: cors
Sec-Fetch-Site: same-origin
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/116.0.0.0 Safari/537.36
X-Csrftoken: NKdsvTELFbhMSumG0aDfJLEw3im0HCDj

Response Headers:
Access-Control-Expose-Headers: X-API-Request-Id
Allow: GET, POST, HEAD, OPTIONS
Content-Language: en
Content-Length: 2266
Content-Type: application/json
Date: Tue, 26 Sep 2023 16:27:02 GMT
Location: /api/v2/project_updates/89960/
Server: nginx
Session-Timeout: 7200
Set-Cookie: awx_sessionid=1ppd0stpfenm8o1y35s2xb5uqxe5a2ts; expires=Tue, 26 Sep 2023 18:27:02 GMT; HttpOnly; Max-Age=7200; Path=/; SameSite=Lax
Vary: Accept, Accept-Language, Origin, Cookie
X-Api-Node: awxpoc-web-8f97cb7fc-khxrp
X-Api-Product-Name: AWX
X-Api-Product-Version: 23.1.0
X-Api-Request-Id: 3a0474daad7d4f6ea3368ab60cd6e50b
X-Api-Time: 0.198s
X-Api-Total-Time: 0.373s

Response:
{
    "project_update": 89960,
    "id": 89960,
    "type": "project_update",
    "url": "/api/v2/project_updates/89960/",
    "related": {
        "created_by": "/api/v2/users/82/",
        "modified_by": "/api/v2/users/82/",
        "credential": "/api/v2/credentials/72/",
        "unified_job_template": "/api/v2/projects/527/",
        "stdout": "/api/v2/project_updates/89960/stdout/",
        "project": "/api/v2/projects/527/",
        "cancel": "/api/v2/project_updates/89960/cancel/",
        "scm_inventory_updates": "/api/v2/project_updates/89960/scm_inventory_updates/",
        "notifications": "/api/v2/project_updates/89960/notifications/",
        "events": "/api/v2/project_updates/89960/events/"
    },
    "summary_fields": {
        "organization": {
            "id": 7,
            "name": "CPSE",
            "description": "Echo team"
        },
        "project": {
            "id": 527,
            "name": "ap_azure-automation_using_awxnewkey",
            "description": "New SSH Key",
            "status": "pending",
            "scm_type": "git",
            "allow_override": false
        },
        "credential": {
            "id": 72,
            "name": "awxnewkey",
            "description": "",
            "kind": "scm",
            "cloud": false,
            "kubernetes": false,
            "credential_type_id": 2
        },
        "unified_job_template": {
            "id": 527,
            "name": "ap_azure-automation_using_awxnewkey",
            "description": "New SSH Key",
            "unified_job_type": "project_update"
        },
        "created_by": {
            "id": 82,
            "username": "j8683",
            "first_name": "Radesh",
            "last_name": "Singh"
        },
        "modified_by": {
            "id": 82,
            "username": "j8683",
            "first_name": "Radesh",
            "last_name": "Singh"
        },
        "user_capabilities": {
            "delete": true,
            "start": true
        }
    },
    "created": "2023-09-26T16:27:02.231149Z",
    "modified": "2023-09-26T16:27:02.254825Z",
    "name": "ap_azure-automation_using_awxnewkey",
    "description": "New SSH Key",
    "local_path": "_527__ap_azure_automation_1033183318_am",
    "scm_type": "git",
    "scm_url": "g...@github.com:csx-technology/ap_azure-automation.git",
    "scm_branch": "main",
    "scm_refspec": "",
    "scm_clean": false,
    "scm_track_submodules": false,
    "scm_delete_on_update": false,
    "credential": 72,
    "timeout": 0,
    "scm_revision": "",
    "unified_job_template": 527,
    "launch_type": "manual",
    "status": "pending",
    "execution_environment": null,
    "failed": false,
    "started": null,
    "finished": null,
    "canceled_on": null,
    "elapsed": 0.0,
    "job_args": "",
    "job_cwd": "",
    "job_env": {},
    "job_explanation": "",
    "execution_node": "",
    "result_traceback": "",
    "event_processing_finished": false,
    "launched_by": {
        "id": 82,
        "name": "j8683",
        "type": "user",
        "url": "/api/v2/users/82/"
    },
    "work_unit_id": null,
    "project": 527,
    "job_type": "check",
    "job_tags": "update_git,install_roles,install_collections"
}
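
The 202 response above only confirms that the update was queued ("status": "pending"); the actual outcome has to be read later from the URLs it returns. Here is a minimal Python sketch of pulling out the fields needed for that follow-up. It assumes the usual AWX job status values, and `TERMINAL_STATUSES` and `summarize_update` are illustrative names, not part of the AWX API.

```python
import json

# Assumption: once a job reaches one of these statuses, AWX does not change it again.
TERMINAL_STATUSES = {"successful", "failed", "error", "canceled"}


def summarize_update(body):
    """Extract the fields needed to follow up on a queued project update."""
    data = json.loads(body)
    return {
        "id": data["id"],
        "status": data["status"],
        "finished": data["status"] in TERMINAL_STATUSES,
        "poll_url": data["url"],
        "stdout_url": data["related"]["stdout"],
    }


# Trimmed copy of the response body captured above.
SAMPLE_BODY = """
{
    "id": 89960,
    "url": "/api/v2/project_updates/89960/",
    "related": {"stdout": "/api/v2/project_updates/89960/stdout/"},
    "status": "pending"
}
"""

summary = summarize_update(SAMPLE_BODY)
print(summary)
```

Requesting `poll_url` again until `finished` becomes true, then reading `stdout_url`, would show whether the update ended as failed or error and what the job printed.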

callm...@gmail.com

Sep 27, 2023, 11:09:46 AM
to AWX Project
I'm looking for the link, but I came across a post from a user who upgraded to v22 and seemed to experience an issue syncing projects.
From what I recall, they killed the task pod and were able to sync.
I just did that, and I am now able to sync some projects.

I don't know how long it will work before I need to kill the pod again, but at least it appears to be a workaround.

Shawn

callm...@gmail.com

Sep 27, 2023, 11:25:36 AM
to AWX Project
OK, I found the link; however, it was for a different error, one I saw when I used Chrome to look at what my browser was receiving during a sync:
https://github.com/ansible/awx/issues/13978

At the end, a poster mentions restarting the task container.

Will update this thread with:

1. Whether the "bandaid" continues to work.
2. Additional info.

Shawn

AWX Project

Sep 29, 2023, 1:46:34 PM
to AWX Project
PLAY [Update source tree if necessary] *****************************************

TASK [Update project using git] ************************************************

Is that the full stdout you see in the UI for that project update? When you relaunch, does it consistently fail on that same task? Also, does the project update end in a 'Failed' or 'Error' status?

AWX Team

callm...@gmail.com

Oct 5, 2023, 9:44:11 AM
to AWX Project
I'm sorry, I found a solution.
After I increased limits and quotas, the issue went away.

Shawn
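
The post does not say which limits and quotas were raised. As one possibility only: if they were container resource limits managed by the AWX Operator, they could be set on the AWX custom resource. The Python sketch below just builds a JSON merge patch for that case. The field names `task_resource_requirements` and `ee_resource_requirements` are believed to match the operator's documented spec but should be checked against your operator version, `build_resource_patch` is a made-up helper, and every value is a placeholder, not the setting that fixed this cluster.

```python
import json


def build_resource_patch(task_cpu, task_memory, ee_cpu, ee_memory):
    """Build a merge patch raising limits for the task and EE containers.

    Assumption: the AWX custom resource accepts these two spec fields.
    """
    return {
        "spec": {
            "task_resource_requirements": {
                "limits": {"cpu": task_cpu, "memory": task_memory},
            },
            "ee_resource_requirements": {
                "limits": {"cpu": ee_cpu, "memory": ee_memory},
            },
        }
    }


# Placeholder values for illustration only.
patch = build_resource_patch("2000m", "4Gi", "1000m", "2Gi")
patch_json = json.dumps(patch, indent=2)
print(patch_json)
```

The printed JSON could then be applied with something like `oc patch awx <instance-name> --type merge -p '<json>'`. Namespace-level quotas (ResourceQuota or LimitRange objects) are separate from this and would need to be raised by a cluster admin.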