Ansible Tower Plugin communication issue between Jenkins and AWX

Richard Lund

May 13, 2021, 8:15:38 AM
to Jenkins Users
Hi.

We use Jenkins to build our apps and then start AWX playbooks to deploy the built apps to OpenShift. To do that, we call the Ansible Tower plugin from our Jenkins pipeline code.

Randomly, we get what looks like communication issues between AWX and Jenkins.

The pipeline will fail with the error:

"ERROR: Failed to get job status from Tower"

Often a 503 error is reported along with the same message.

I have checked all the Jenkins logs I can get to, but I can't find more details on this.

AWX is deployed in OpenShift as a StatefulSet. I have scanned through awx:celery and awx:web to try to track this down, but the only mention of a 503 is this:

[Ansible-Tower] Building GET request to https://awx/api/v2/jobs/57602/
[Ansible-Tower] Forcing cert trust
[Ansible-Tower] Request completed with (503)
[Ansible-Tower] Deleting oAuth token 15396 for awx
[Ansible-Tower] Forcing cert trust
[Ansible-Tower] Calling for oAuth token delete at https://awx/api/v2/tokens/15396/
[Ansible-Tower] Request completed with (200)

This seems to happen most often while AWX is waiting for OpenShift to finish a deployment. The job ALWAYS finishes normally - yet Jenkins receives this error and fails the pipeline even though everything else worked. It also seems to happen mostly on the apps that take the longest to deploy - maybe 10-15 minutes.

We had this issue last year, and then it went away. Now it is back, and it is affecting our work: pipelines are marked as failed, and we constantly have to double-check whether the job actually finished.
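
For anyone who wants to script that manual double-check, here is a minimal sketch in Python. It is not what the Ansible Tower plugin does internally; it only polls the same jobs endpoint seen in the log above and treats a non-200 response such as the 503 as transient until several occur in a row. The function names, the retry limit, and the delay are hypothetical choices, and the sketch assumes an OAuth2 token is available and that the AWX certificate is trusted by the default Python TLS settings.

```python
import json
import time
import urllib.error
import urllib.request

# Terminal job states reported in the "status" field of /api/v2/jobs/<id>/.
TERMINAL_STATES = {"successful", "failed", "error", "canceled"}


def fetch_job_status(base_url, job_id, token):
    """Return (http_code, status); status is None when AWX answers with an error code."""
    req = urllib.request.Request(
        f"{base_url}/api/v2/jobs/{job_id}/",
        headers={"Authorization": f"Bearer {token}"},
    )
    try:
        with urllib.request.urlopen(req, timeout=30) as resp:
            return resp.status, json.load(resp).get("status")
    except urllib.error.HTTPError as err:
        return err.code, None


def wait_for_job(fetch, max_failures=5, delay=10.0, sleep=time.sleep):
    """Poll until the job reaches a terminal state.

    Non-200 responses are tolerated until `max_failures` (an assumed
    limit) of them occur in a row; a 200 response resets the counter.
    """
    failures = 0
    while True:
        code, status = fetch()
        if code == 200:
            failures = 0
            if status in TERMINAL_STATES:
                return status
        else:
            failures += 1
            if failures >= max_failures:
                raise RuntimeError(
                    f"{failures} consecutive failures, last HTTP {code}"
                )
        sleep(delay)


# Hypothetical usage, with the host and job id taken from the log above:
# wait_for_job(lambda: fetch_job_status("https://awx", 57602, "<token>"))
```

Passing the fetch step in as a callable keeps the retry logic separate from the HTTP call, so the tolerance for consecutive failures can be tried out without a live AWX instance.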

I have restarted Jenkins and AWX, and I have updated the Ansible Tower plugin to the newest version - no change.

AWX is on version 9.0.0.0 (installed in OpenShift) with Ansible 2.8.5.

Jenkins is on version 2.235.1 (also installed in OpenShift).

Ansible Tower Plugin is on version 0.16.0.

I'd appreciate any help or pointers for tracking this down.

Thanks!

Richard Lund

May 27, 2021, 11:50:51 AM
to Jenkins Users
Quick update: I just had the pipeline fail NOT during a long wait but only a few seconds after the playbook started, while Ansible was setting a variable. So the issue does NOT appear to be related to long wait times.

Richard Lund

Jul 14, 2021, 12:55:19 PM
to Jenkins Users
Bump - still hoping someone can help me with this.

Thanks!
