[AWX 20+] Cannot extend base image anymore, as it seems network is unreachable since the migration to CentOS 9


Maxence BUTTON

Mar 23, 2023, 4:31:21 AM
to AWX Project
Hi,

I recently opened a bug (which is not really a bug, more a question) and I was advised to post it here as it seems more relevant.

Here is the summary :

I've encountered a weird behavior while migrating from AWX 17 to 21.11.0.

We are not using the AWX base image directly in the Operator; we first extend it with some packages (to suit my client's needs).
In 17, we used to define our own YUM repo and install those packages. Everything was smooth and we could generate our own image without any problem.

We decided to migrate to the latest version (21.11.0 at the time of writing), and then our build pipeline failed. At first, we thought it was because the new internal mirrors of the CentOS 9 repos were not up, but that's not it: no external URL is reachable.

I tried to diagnose the problem by logging into the image and performing some network investigations, but as the image does not have any network tools (ping, ip, host, dig, tracepath ...), it's very hard to tell what's wrong. I checked resolv.conf, the hosts file, the SELinux config, access.conf, etc., and nothing obvious came out of it.
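
Since the image lacks the usual network tools, one option is to probe from Python, which the AWX image ships because AWX itself is a Python application. The following is only a minimal sketch: the `probe` helper and its parameters are hypothetical names introduced here, and it assumes `python3` is on the PATH inside the container. It separates a DNS failure from a TCP failure, which is the first thing worth knowing.

```python
import socket
import sys


def probe(host, port=443, timeout=5.0, resolve=socket.getaddrinfo,
          connect=socket.create_connection):
    """Resolve host, then try a TCP connection; report which step fails."""
    result = {"host": host, "addresses": [], "dns_ok": False,
              "tcp_ok": False, "error": None}
    try:
        infos = resolve(host, port, type=socket.SOCK_STREAM)
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        result["error"] = f"DNS lookup failed: {exc}"
        return result
    # Each getaddrinfo entry ends with a sockaddr tuple; its first item is the IP.
    result["addresses"] = sorted({info[4][0] for info in infos})
    result["dns_ok"] = True
    try:
        with connect((host, port), timeout=timeout):
            result["tcp_ok"] = True
    except OSError as exc:
        result["error"] = f"TCP connect failed: {exc}"
    return result


# Only runs a real probe when a host name is given on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    print(probe(sys.argv[1]))
```

Saved as, say, `probe.py` (a file name chosen for illustration), it can be run inside the container as `python3 probe.py <host>`. If `dns_ok` is false, the problem is on the resolver side rather than routing or firewalling.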

I checked all the versions and the problem seems to appear in version 20 (with the switch to CentOS9).
Again, I may be missing something obvious, but I carefully read the docs (AWX + CentOS), browsed the current issues and couldn't find the slightest clue.

For now, as I'm at an early stage, I just dropped the installation of the additional packages, but as they are security related, I won't be able to go to production without them.


All is detailed here :

Thanks in advance for any piece of information, advice or experience on that.

kurokobo

Mar 23, 2023, 8:43:01 AM
to awx-p...@googlegroups.com
Hi,

> getaddrinfo() thread failed to start

What version of Docker are you using?
Old Docker versions sometimes cause similar issues, because the newer glibc shipped in CentOS 9 does not work properly on them.
I'd recommend trying again with a newer Docker.

If your issue still exists with the latest Docker, you should start your investigation with a plain CentOS Stream 9 image instead of AWX.

Regards,

--
@kurokobo



Maxence BUTTON

Mar 23, 2023, 11:59:18 AM
to AWX Project
Hi,

Nice suggestion, I will try that right away and give you an updated status.

Thanks !

Maxence BUTTON

Mar 23, 2023, 1:06:37 PM
to AWX Project
Alas, I ended up with the same result.

I was initially working with a RHEL 7 machine with Docker 18.03.
My latest test was on a RHEL 8, with Podman 2.0.5.

Ok, I will investigate directly with a raw CentOS 9 image and see what I can do with that.

Thanks for the reply anyway.

kurokobo

Mar 23, 2023, 2:35:37 PM
to awx-p...@googlegroups.com
Hi,

Both Docker 18.03 and Podman 2.0.5 are too old :(
I don't think such old versions of Docker or Podman can handle the security-hardened syscall wrappers implemented in glibc 2.34.
You should simply try a newer one.
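
The curl message says its resolver thread failed to start, so a quick way to check whether a given runtime is affected is to test thread creation directly inside the container. This is only a rough sketch under assumptions: `can_start_thread` is a hypothetical helper name, `python3` is assumed to be available in the image, and a Python thread is assumed to hit the same limitation as curl's resolver thread.

```python
import threading


def can_start_thread(timeout=5.0):
    """Return (ok, detail): whether this process can create a new thread."""
    done = threading.Event()
    worker = threading.Thread(target=done.set)
    try:
        worker.start()
    except RuntimeError as exc:  # CPython raises "can't start new thread"
        return False, str(exc)
    worker.join(timeout)
    if done.is_set():
        return True, "thread started and ran"
    return False, "thread started but did not finish in time"


if __name__ == "__main__":
    print(can_start_thread())
```

If this reports a failure on the old runtime and succeeds on a newer one with the same image, the container runtime is the likely culprit rather than the image content.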

Regards,
 
--
@kurokobo


Maxence BUTTON

Mar 24, 2023, 6:12:47 AM
to AWX Project
Hey,

Thanks for your input. I will see what my options are here, as my client has a fixed path regarding package upgrades.
But you are right, it has nothing to do with AWX; it's more a matter of CentOS 9 and the container runtime.

I guess we can put the thread on hold for now, but I will post the results of my future tests here and in the "bug" I opened.

Thanks again for your valuable input @kurokobo.

Maxence BUTTON

Mar 24, 2023, 10:00:43 AM
to AWX Project
Hi again,

Ok, I found an alternate repo to get a more recent Podman package (4.2.0), but unfortunately the result remains unchanged:

[root@max-rhel8 ~]# podman --version
podman version 4.2.0
[root@max-rhel8 ~]# podman run --rm -it --entrypoint=bash d7456a00e6af
bash-5.1$ curl -kv https://artifactory.internal.com/artifactory
* Could not resolve host:  artifactory.internal.com
* Closing connection 0
curl: (6) Could not resolve host:  artifactory.internal.com
bash-5.1$

But you're right, I'll dig deeper with a base CentOS 9 image.

Anyway, thanks again for your suggestions.

kurokobo

Mar 24, 2023, 12:30:26 PM
to awx-p...@googlegroups.com
Hi,

> the result remains unchanged

I see that your error has changed.

- On old Docker / Podman:
  > curl: (6) getaddrinfo() thread failed to start
- On newer Podman:
  > curl: (6) Could not resolve host: artifactory.internal.com

So I think your initial issue has been solved on the newer Podman, but there is a different issue now.
I guess it's a DNS-related issue. Try double-checking the DNS settings inside the container and around Podman.
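
As a starting point for that check, here is a small sketch that lists the resolvers the container is actually configured to use. It is only an illustration: `parse_resolv_conf` is a hypothetical helper, it handles just the `nameserver` and `search` keywords, and it assumes `python3` is available in the image.

```python
from pathlib import Path


def parse_resolv_conf(text):
    """Extract nameserver and search entries from resolv.conf content."""
    config = {"nameservers": [], "search": []}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments ('#' or ';' at the start of a line).
        if not line or line[0] in "#;":
            continue
        parts = line.split()
        if parts[0] == "nameserver" and len(parts) >= 2:
            config["nameservers"].append(parts[1])
        elif parts[0] == "search":
            config["search"] = parts[1:]
    return config


if __name__ == "__main__":
    path = Path("/etc/resolv.conf")
    if path.exists():
        print(parse_resolv_conf(path.read_text()))
```

Running it both on the host and inside the container makes it easy to see whether the container received resolvers that can answer for the internal domain.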

Regards,

--
@kurokobo

