postgres pod FailedScheduling - 1 node(s) didn't find available persistent volumes to bind.


Andrei Mihai

Sep 12, 2022, 5:32:50 AM
to AWX Project
Hello AWX Team,

The environment is a Kubernetes cluster (not minikube) on Oracle Linux 8.
I have tried the same on Ubuntu and run into the same issue.

I've followed all the guides, and it seems the postgres pod fails scheduling while the PVC waits for the pod to be scheduled. The PVC, PV, and StorageClass are all created, and I even have another PV/PVC/SC set up for the projects storage, which are bound correctly.

Here are all the details I could muster.
If I can provide more, I am happy to. I've obfuscated some sensitive details with ****.

[sysadmin@dev-awx-01 k8awx]$ kubectl describe pod ****-postgres-13-0
Name:           ****-postgres-13-0
Namespace:      awx
Priority:       0
Node:           <none>
Labels:         app.kubernetes.io/component=database
                app.kubernetes.io/instance=postgres-13-****
                app.kubernetes.io/managed-by=awx-operator
                app.kubernetes.io/name=postgres-13
                app.kubernetes.io/part-of=****
                controller-revision-hash=****-postgres-13-8677ccdd5d
                statefulset.kubernetes.io/pod-name=****-postgres-13-0
Annotations:    <none>
Status:         Pending
IP:
IPs:            <none>
Controlled By:  StatefulSet/****-postgres-13
Containers:
  postgres:
    Image:      postgres:13
    Port:       5432/TCP
    Host Port:  0/TCP
    Requests:
      cpu:     10m
      memory:  64Mi
    Environment:
      POSTGRESQL_DATABASE:        <set to the key 'database' in secret '****-postgres-configuration'>  Optional: false
      POSTGRESQL_USER:            <set to the key 'username' in secret '****-postgres-configuration'>  Optional: false
      POSTGRESQL_PASSWORD:        <set to the key 'password' in secret '****-postgres-configuration'>  Optional: false
      POSTGRES_DB:                <set to the key 'database' in secret '****-postgres-configuration'>  Optional: false
      POSTGRES_USER:              <set to the key 'username' in secret '****-postgres-configuration'>  Optional: false
      POSTGRES_PASSWORD:          <set to the key 'password' in secret '****-postgres-configuration'>  Optional: false
      PGDATA:                     /var/lib/postgresql/data/pgdata
      POSTGRES_INITDB_ARGS:       --auth-host=scram-sha-256
      POSTGRES_HOST_AUTH_METHOD:  scram-sha-256
    Mounts:
      /var/lib/postgresql/data from postgres-13 (rw,path="data")
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-glhch (ro)
Conditions:
  Type           Status
  PodScheduled   False
Volumes:
  postgres-13:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  postgres-13-****-postgres-13-0
    ReadOnly:   false
  kube-api-access-glhch:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason            Age               From               Message
  ----     ------            ----              ----               -------
  Warning  FailedScheduling  9s (x2 over 96s)  default-scheduler  0/1 nodes are available: 1 node(s) didn't find available persistent volumes to bind.

I have two storage classes created. They are identical except for the volume binding mode. If I set the binding mode to Immediate for the postgres SC, the postgres pod fails to schedule with `pod has unbound immediate PersistentVolumeClaims`.
The projects-storage PV will bind regardless of binding mode.

[sysadmin@dev-awx-01 k8awx]$ kubectl get sc
NAME               PROVISIONER                    RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
postgres-storage   kubernetes.io/no-provisioner   Delete          WaitForFirstConsumer   false                  2m53s
projects-storage   kubernetes.io/no-provisioner   Delete          Immediate              false                  2m53s
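For reference, a no-provisioner StorageClass like these boils down to a manifest along the following lines (a sketch; the name is taken from my setup above). With `WaitForFirstConsumer`, binding is deferred until a pod that uses the claim is being scheduled, and the scheduler then has to find an Available PV in that class whose node affinity matches a usable node; with `Immediate`, the controller binds the claim to any matching PV without consulting node affinity at all.

```yaml
# Sketch of a no-provisioner StorageClass.
# kubernetes.io/no-provisioner means nothing is created dynamically:
# a matching pre-created PV must already exist for a PVC to bind.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: postgres-storage
provisioner: kubernetes.io/no-provisioner
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer  # bind only once a consuming pod is scheduled
allowVolumeExpansion: false
```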

The PVs are the same. They are just named differently and point to their own storage class.

[sysadmin@dev-awx-01 k8awx]$ cat projects-storage-pv.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: projects-storage-pv
  namespace: awx
spec:
  accessModes:
  - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  capacity:
    storage: 20Gi
  volumeMode: Filesystem
  storageClassName: projects-storage
  local:
    path: /data/awx
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - dev-awx-0

[sysadmin@dev-awx-01 k8awx]$ cat postgres-pv.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: postgres-storage-pv
spec:
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  capacity:
    storage: 20Gi
  volumeMode: Filesystem
  storageClassName: postgres-storage
  hostPath:
    path: /data/postgress
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - dev-awx-0

The PVs get created properly.

We are using local storage for this:
[sysadmin@dev-awx-01 k8awx]$ df -h
/dev/mapper/ansible-awx--storage        25G  211M   25G   1% /data/awx
/dev/mapper/ansible-postgres--storage   25G  211M   25G   1% /data/postgress
[sysadmin@dev-awx-01 k8awx]$

[sysadmin@dev-awx-01 k8awx]$ kubectl get pv
NAME                  CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM                      STORAGECLASS       REASON   AGE
postgres-storage-pv   20Gi       RWX            Retain           Available                              postgres-storage            3m
projects-storage-pv   20Gi       RWX            Retain           Bound       awx/projects-storage-pvc   projects-storage            3m

However, the PVC for postgres-13-*****-postgres-13-0 will not bind.

[sysadmin@dev-awx-01 k8awx]$ kubectl get pvc
NAME                              STATUS    VOLUME                CAPACITY   ACCESS MODES   STORAGECLASS       AGE
postgres-13-*****-postgres-13-0   Pending                                                   postgres-storage   2m40s
projects-storage-pvc              Bound     projects-storage-pv   20Gi       RWX            projects-storage   3m3s


Describing the PVC shows it is waiting for the pod to be scheduled.

[sysadmin@dev-awx-01 k8awx]$ kubectl describe pvc postgres-13-****-postgres-13-0
Name:          postgres-13-****-postgres-13-0
Namespace:     awx
StorageClass:  postgres-storage
Status:        Pending
Volume:
Labels:        app.kubernetes.io/component=database
               app.kubernetes.io/instance=postgres-13-****
               app.kubernetes.io/managed-by=awx-operator
               app.kubernetes.io/name=postgres-13
Annotations:   <none>
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:
Access Modes:
VolumeMode:    Filesystem
Used By:       ****-postgres-13-0
Events:
  Type    Reason                Age                     From                         Message
  ----    ------                ----                    ----                         -------
  Normal  WaitForFirstConsumer  9m29s                   persistentvolume-controller  waiting for first consumer to be created before binding
  Normal  WaitForPodScheduled   3m25s (x25 over 9m25s)  persistentvolume-controller  waiting for pod ****-postgres-13-0 to be scheduled
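As I understand it, the `didn't find available persistent volumes to bind` event comes from the scheduler's volume-binding step: with WaitForFirstConsumer, a node is only considered usable if an Available PV of the right StorageClass has a `nodeAffinity` that matches that node's labels verbatim. This is the kind of fragment worth double-checking against `kubectl get nodes --show-labels` (values here are illustrative):

```yaml
# The PV's nodeAffinity must match the node's actual label exactly.
# E.g. if the node carries kubernetes.io/hostname=dev-awx-01, a PV
# listing a different value will never be bindable on that node, and
# scheduling fails with "didn't find available persistent volumes to bind".
nodeAffinity:
  required:
    nodeSelectorTerms:
    - matchExpressions:
      - key: kubernetes.io/hostname
        operator: In
        values:
        - dev-awx-01  # illustrative; must equal the node's hostname label
```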

Below is the awx yaml and the kustomization.yaml.

---
apiVersion: awx.ansible.com/v1beta1
kind: AWX
metadata:
  name: ****
  namespace: awx
spec:
  service_type: nodeport
  ingress_type: ingress
  hostname: ansible-lower.****.com
  ingress_tls_secret: ****-tls
  auto_upgrade: true

  projects_persistence: true
  projects_existing_claim: projects-storage-pvc

  postgres_storage_class: postgres-storage

  set_self_labels: false
  no_log: 'false'


apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  # Find the latest tag here: https://github.com/ansible/awx-operator/releases
  - github.com/ansible/awx-operator/config/default?ref=0.28.0
  - projects-sc.yaml
  - postgres-sc.yaml
  - projects-storage-pv.yaml
  - postgres-pv.yaml
  - projects-storage-pvc.yaml
  - awx.yaml



# Set the image tags to match the git version from above
images:
  - name: quay.io/ansible/awx-operator
    newTag: 0.28.0

# Specify a custom namespace in which to install AWX
namespace: awx

I would appreciate any help/advice on this.

Thanks in advance.

AWX Project

Sep 16, 2022, 6:02:54 PM
to AWX Project
I noticed that there is a typo in the hostPath of your postgres PV; perhaps that could be part of the issue (/data/postgres instead of /data/postgress).

I am a bit confused here. The StorageClass should dynamically provision the PVC as needed for you when AWX is deployed. Perhaps we could add `postgres_pvc_claim` as an option on the AWX spec so that users could pre-create their own PV and PVC for postgres. Please open an issue in the awx-operator repo if that is something you would find useful.

Thanks,
AWX Team

Andrei Mihai

Sep 19, 2022, 10:36:13 AM
to AWX Project
Hi there,

Thank you for getting back to me.

The typo should not matter: the path on disk really is named /data/postgress rather than /data/postgres, because I accidentally typed an extra "s" when creating it. The manifests all point to that same path.

"The StorageClass should dynamically provision the PVC as need for  your when AWX is deployed."
That's the thing. It doesn't. If you can tell me there is something wrong with my manifests, I'm more than willing to redo them.

"Perhaps we could ass `postgres_pvc_claim` as an option on the AWX spec so that users could pre-create their own pv and pvc for postgres. "
I actually looked for that and could not find it and wondered why it does not exist. In my case, I am deploying on bare-metal/vms and I don't necessarily have access to cloud storage types that would make a lot of sense, performance-wise, in my deployment.

In any case, I was able to bypass this with an external postgresql database. I scripted the setup so whenever I deploy a new AWX instance, I also get a new psql instance running with a new db, role, access permissions and all that jazz.
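The external-database route looks roughly like this (a sketch based on the awx-operator's external postgres support; all hostnames and credentials here are illustrative placeholders). The operator reads the secret named by `postgres_configuration_secret`, and `type: unmanaged` tells it not to deploy its own postgres StatefulSet at all, which sidesteps the PVC binding problem entirely:

```yaml
# Sketch: pointing awx-operator at an external ("unmanaged") postgres.
# Secret key names follow the awx-operator convention; values are illustrative.
---
apiVersion: v1
kind: Secret
metadata:
  name: awx-postgres-configuration
  namespace: awx
stringData:
  host: psql.example.com   # illustrative external DB host
  port: "5432"
  database: awx
  username: awx
  password: changeme       # illustrative; store a real value securely
  sslmode: prefer
  type: unmanaged          # do not deploy the operator-managed postgres
type: Opaque
---
apiVersion: awx.ansible.com/v1beta1
kind: AWX
metadata:
  name: awx
  namespace: awx
spec:
  postgres_configuration_secret: awx-postgres-configuration
```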

"Please open an issue in the awx-operator repo if that is something you would find useful.  "

I will. :)

Cheers!