RuntimeError: Error streaming stdin to pod

Caleb Brewer (Archery)

Jul 12, 2023, 12:09:41 PM
to AWX Project
When running a playbook, I get this error...

RuntimeError: Error streaming stdin to pod awx/automation-job-96-p897n. Error: error sending request: Post "https://10.0.0.1:443/api/v1/namespaces/awx/pods/automation-job-96-p897n/attach?container=worker&stdin=true": read tcp 10.231.80.16:46144->10.0.0.1:443: read: connection timed out

  • 10.231.80.16 is the IP of the task pod (46144 is the source port)
  • 10.0.0.1 is the K8s API server
  • I have been able to test connectivity between the POD and the K8S server
  • The automation-job-96-p897n is created and runs for the duration of the job
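For anyone who wants to repeat the connectivity check from inside the task pod, here is a minimal sketch. The helper name `can_connect` is hypothetical (it is not part of AWX or receptor), and the 5-second default timeout is an arbitrary assumption. Note that it only confirms that a short TCP connection can be opened; it does not exercise the long-lived attach stream on which the read timeout above occurred.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False
```

Calling `can_connect("10.0.0.1", 443)` would target the address from the error message above.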
Contents of the playbook:
---
- name: Hello World!
  hosts: server01

  tasks:

  - name: Hello World!
    shell: echo "Hi! Tower is working."

AWX Project

Jul 14, 2023, 1:40:43 PM
to AWX Project
So did the job run successfully, or did it end in a failed/error state? In your awx-ee container logs, do you see "retry" messages for this job? For example:

Error streaming stdin to pod ... Will retry 5 more times.

Does this problem happen sporadically or every time?

What k8s cluster are you on (k3s, k3d, etc.)?
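If the logs are long, a small script can pick out the retry messages. This is only a sketch: the regular expression is an assumption based solely on the sample message quoted above, and the helper name `find_retries` is hypothetical.

```python
import re

# Assumed pattern, derived only from the sample retry message quoted above.
RETRY_RE = re.compile(r"Error streaming stdin to pod\b.*?Will retry (\d+) more times")


def find_retries(log_text: str) -> list[int]:
    """Return the remaining-retry counts found in log_text, in order of appearance."""
    # '.' does not match newlines, so each match stays within a single log line.
    return [int(m.group(1)) for m in RETRY_RE.finditer(log_text)]
```

An empty result would suggest that no retry messages matching this pattern were logged for the job.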

AWX Team

Caleb Brewer (Archery)

Jul 14, 2023, 4:46:18 PM
to AWX Project
Thanks for your help.
We are running AWX on Azure AKS.
The job fails every time. 

Here are the output logs from the job...

Traceback (most recent call last):
  File "/var/lib/awx/venv/awx/lib64/python3.9/site-packages/awx/main/tasks/receptor.py", line 435, in _run_internal
    lines = resultfile.readlines()
OSError: read() should have returned a bytes object, not 'NoneType'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/var/lib/awx/venv/awx/lib64/python3.9/site-packages/awx/main/tasks/jobs.py", line 597, in run
    res = receptor_job.run()
  File "/var/lib/awx/venv/awx/lib64/python3.9/site-packages/awx/main/tasks/receptor.py", line 317, in run
    res = self._run_internal(receptor_ctl)
  File "/var/lib/awx/venv/awx/lib64/python3.9/site-packages/awx/main/tasks/receptor.py", line 444, in _run_internal
    raise RuntimeError(detail)
RuntimeError: Error streaming stdin to pod awx/automation-job-173-8rsrb. Error: error sending request: Post "https://10.0.0.1:443/api/v1/namespaces/awx/pods/automation-job-173-8rsrb/attach?container=worker&stdin=true": read tcp 10.231.80.47:44656->10.0.0.1:443: read: connection timed out

AWX Project

Jul 19, 2023, 2:43:13 PM
to AWX Project
At first glance this looks like a Kubernetes-level problem, not something specific to AWX.

So the automation-job pods are staying up indefinitely, even after these stdin POST timeout errors?

Does kubectl logs on that automation job pod show you stdout? Do you see Ansible events?

Also, as a side note, you may want to enable RECEPTOR_KUBE_SUPPORT_RECONNECT (see this PR: https://github.com/ansible/receptor/pull/683), but I don't think it is related to your current issue.

AWX Team