Stumped on how to restore AWX backup to minikube running on mac laptop.


RedCrick

Apr 17, 2023, 1:32:09 PM
to AWX Project
Hi,

I have an awx-operator deployment running and I am testing my disaster recovery procedures.  I have made a backup of my awx-operator deployment and moved the backup files to my laptop:

[r...@BP22006.local tower-openshift-backup-2023-04-14-16:06:26]$ pwd
/Users/russell.cecala/AWX/BACKUP_RESTORE/rocky/pvc-37ccc9bf-1ffb-4565-bd31-5c3eb1ad21cb_awx_awx-demo-backup-claim/tower-openshift-backup-2023-04-14-16:06:26
[r...@BP22006.local tower-openshift-backup-2023-04-14-16:06:26]$ ls -l
total 164360
-rw-r--r--  1 red  staff      1118 Apr 17 09:57 awx_object
-rw-r--r--  1 red  staff     13164 Apr 17 09:57 secrets.yml
-rw-r-----  1 red  staff  72039972 Apr 17 09:57 tower.db


I think now I need to create a PVC on my laptop's minikube system like so ...

$ cat laptop-pvc.yml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: awx-backup
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi

$ kubectl apply -f laptop-pvc.yml
persistentvolumeclaim/awx-backup created

But when I do a describe on the PV, I see a Path that does not exist!

$ kubectl describe pv pvc-3910cc84-e2bd-47cb-9830-8b13516cd56f | grep Path
Annotations:     hostPathProvisionerIdentity: ced76979-2cc1-4903-891e-9a30e656cf5b
    Type:          HostPath (bare host directory volume)
    Path:          /tmp/hostpath-provisioner/default/awx-backup
    HostPathType:
$ ls -l /tmp/hostpath-provisioner/default/awx-backup
ls: /tmp/hostpath-provisioner/default/awx-backup: No such file or directory

Do I need to create that path first?
Please help. I am getting very confused and cannot figure out how to do a simple restore from a backup.


Michael Kelly

Apr 18, 2023, 6:56:57 AM
to awx-p...@googlegroups.com
I'm pretty sure that you need to create the path first.
As well as the PVC, you also need to define a PV.
Have a look at the pv.yaml and pvc.yaml files in https://github.com/kurokobo/awx-on-k3s/tree/main/base.
The storageClassName links the pvc to the pv.
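Roughly something like this — a sketch modeled on those files rather than their exact contents; the hostPath path, capacity, and storageClassName are illustrative, and the storageClassName just has to match on both sides:

---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: awx-backup-volume
spec:
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  capacity:
    storage: 1Gi
  storageClassName: awx-backup-volume
  hostPath:
    path: /data/backup
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: awx-backup
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: awx-backup-volume
  resources:
    requests:
      storage: 1Gi

With a pair like that, the claim binds to the volume you pre-created instead of waiting on the hostpath provisioner, and whatever you put under /data/backup on the node is what the pod will see.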


RedCrick

Apr 18, 2023, 11:16:32 AM
to AWX Project
Thanks for the reply, Michael, and thanks for the link to your GitHub.  Lots of good info there.
Please let me show you here what I am doing and maybe you can see where I am going wrong.

Here are the files I need to restore.

bpadmin@minikube:~/RESTORE/FILES$ ls -l
total 70376
-rw-r--r-- 1 bpadmin bpadmin     1118 Apr 17 13:13 awx_object
-rw-r--r-- 1 bpadmin bpadmin    13164 Apr 17 13:13 secrets.yml
-rw-r----- 1 bpadmin bpadmin 72039972 Apr 17 13:13 tower.db

I have created the /data dir's as you suggested:

sudo mkdir -p /data/postgres-13
sudo mkdir -p /data/projects
sudo chmod 755 /data/postgres-13
sudo chown 1000:0 /data/projects

I have created the PVC and PV.

bpadmin@minikube:~$ kubectl get pv
NAME                     CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM                        STORAGECLASS          REASON   AGE
awx-postgres-13-volume   8Gi        RWO            Retain           Available                                awx-postgres-volume            10m
awx-projects-volume      2Gi        RWO            Retain           Bound       default/awx-projects-claim   awx-projects-volume            10m
bpadmin@minikube:~$ kubectl get pvc
NAME                 STATUS   VOLUME                CAPACITY   ACCESS MODES   STORAGECLASS          AGE
awx-projects-claim   Bound    awx-projects-volume   2Gi        RWO            awx-projects-volume   10m

Now here is where I think I go wrong.

I copy my backup files to the /data dirs:

bpadmin@minikube:~$ tree /data
/data
├── postgres-13
│   └── tower.db
└── projects
    ├── awx_object
    └── secrets.yml

Then I do

bpadmin@minikube:~$ kubectl apply -k restore
error: must build at directory: not a valid directory: evalsymlink failure on 'restore' : lstat /home/bpadmin/restore: no such file or directory

But as you can see, I get an error message. I really don't understand what I am supposed to do. Any help would be great. :)

Michael Kelly

Apr 18, 2023, 1:43:12 PM
to awx-p...@googlegroups.com
How did you set up AWX and generate a backup?

RedCrick

Apr 18, 2023, 2:02:33 PM
to AWX Project
First, thanks for the reply :)  
My source system is an awx-operator deployment running on a 3-node k3s cluster.  Basically it is the awx-demo deployment used in the tutorial for setting up awx-operator.
I created a backup of this AWX server by creating this backup-awx.yml file:

$ cat backup-awx.yml
---
apiVersion: awx.ansible.com/v1beta1
kind: AWXBackup
metadata:
  name: awxbackup-2023-04-13
  namespace: awx
spec:
  deployment_name: awx-demo

And then applied that file like so:
kubectl apply -f backup-awx.yml
...

This created the backup files on one of my k3 nodes:

[root@rocky-k3-2 ~]# cd /var/lib/rancher/k3s/storage/pvc-37ccc9bf-1ffb-4565-bd31-5c3eb1ad21cb_awx_awx-demo-backup-claim/tower-openshift-backup-2023-04-14-16\:06\:26
[root@rocky-k3-2 tower-openshift-backup-2023-04-14-16:06:26]# ls -l
-rw-r--r-- 1 root root     1118 Apr 14 09:06 awx_object
-rw-r--r-- 1 root root    13164 Apr 14 09:06 secrets.yml
-rw-rw---- 1 root root 72039972 Apr 14 09:06 tower.db

I then set up a completely new system with minikube on it so I can test these backup files and test my restore procedure.
And I scp'ed the awx_object, secrets.yml, and tower.db files to my new system.  I want to load my backup files into my new minikube cluster so I can show that the backup works.

Christian Adams

Apr 18, 2023, 2:31:54 PM
to awx-p...@googlegroups.com
Hi @RedCrick,

A bit of context first: The main use case for the AWXBackup and AWXRestore objects is to take a backup you can restore from on the same cluster, typically during upgrades. Before you do an upgrade, take a backup, and if the upgrade causes unanticipated issues, you can always delete the AWX CR and restore from the AWXBackup object (which has knowledge of the backup PVC and correct backup directory on it).

So a typical user would have an AWXRestore object like this, where the "backup_name" is the name of the AWXBackup object.

---
apiVersion: awx.ansible.com/v1beta1
kind: AWXRestore
metadata:
  name: restore1
  namespace: awx
spec:
  deployment_name: awx-new
  backup_name: awxbackup-2023-04-13


Links to docs:

However, it seems like you want to migrate to a different cluster entirely.  For this, you have a few options I know of:
  • Use the migration logic, which requires both clusters to be up simultaneously.  Docs here.
  • OR do what you are attempting to do, which would allow you to restore your AWX even if your original cluster is not available (assuming you have the contents of the backup somewhere accessible, as you described).
  • Alternatively, you could take an infrastructure-as-code approach by: A) using the awx.awx import/export modules (downside: credentials won't be exported; see the CLI sketch after this list), or B) defining your resources with the awx-resource-operator, a k8s-native way to define the resources (projects, job templates, credentials, etc.) in your AWX instance.  The resource operator is not fully feature complete yet, but is rapidly approaching it.
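
For option A, a rough sketch using the awx CLI from awxkit, which exposes the same export/import functionality as the awx.awx modules — the hostnames and tokens below are placeholders:

# Export everything exportable from the old instance
awx --conf.host https://old-awx.example.org --conf.token $OLD_TOKEN export > awx-export.json

# Import it into the new instance
awx --conf.host https://new-awx.example.org --conf.token $NEW_TOKEN import < awx-export.json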

---------------------------------------

All that information aside, if you continue with your current approach, you will need to specify the following on your AWXRestore object yaml:

---
apiVersion: awx.ansible.com/v1beta1
kind: AWXRestore
metadata:
  name: restore1
  namespace: awx
spec:
  deployment_name: awx-new
  
  backup_pvc: awx-backup-volume-claim
  backup_dir: /backups/tower-openshift-backup-2023-04-14-16:06:26
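
Note that backup_pvc has to name a claim that already exists in the awx namespace on the new cluster and that contains your backup directory. A minimal sketch of such a claim, assuming minikube's default dynamic provisioner (the name must match backup_pvc; the size is illustrative):

---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: awx-backup-volume-claim
  namespace: awx
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi

You would then copy the tower-openshift-backup-2023-04-14-16:06:26 directory into the volume backing that claim before applying the AWXRestore; the operator mounts the backup claim in the management pod, which is why backup_dir above starts with /backups.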

Hopefully this information helps,
AWX Team








--
__________________________________________________
Christian M. Adams
Software Engineer at Ansible - Red Hat
cha...@redhat.com  |  (919) 218-5080  |  GitHub: rooftopcellist

RedCrick

Apr 18, 2023, 7:23:23 PM
to AWX Project
Thank you Christian, 

I must have done something wrong when I ran the AWXRestore object YAML you suggested.
What do you think my chances are if, on my minikube cluster, I create an AWXBackup, overwrite the backup files it creates with the backup files from my old AWX server, and then do a restore, so as to imitate the main use case?

Michael Kelly

Apr 19, 2023, 8:15:33 AM
to awx-p...@googlegroups.com
Christian,
thank you for clarifying the use of the AWXBackup and AWXRestore objects.

I had been thinking about how I might use the AWXBackup resource to implement scheduled backups but my knowledge of Kubernetes is not great, so I started looking for other options.
There are a few.
Two in particular stood out.

One is called Velero.
This is an open-source project and ongoing development is very active: 37 open PRs, 400+ issues, and the most recent commit was yesterday.
Backups are stored in the cloud.

The other one is Kasten K10.
This is a commercial product but it does have a free version that you can use with up to 5 nodes.
However, it comes with a fairly comprehensive EULA that must be accepted before you can start to use it.
I'm waiting on feedback from our legal eagle before proceeding.
One thing that I do like about it is that, in addition to using cloud storage, you can also store backups in NFS storage.

Hope this helps.

RedCrick

Apr 19, 2023, 12:59:46 PM
to AWX Project
Hi All,  I tried my approach of "faking out" the AWXBackup/AWXRestore use case.

I believe I deleted the postgres PVC and the awx-demo CRs. I deleted the PVC with `kubectl delete pvc postgres-13-awx-demo-postgres-13-0 -n awx`, but it only ever went to "Terminating" status.  I also deleted the awx-demo deployment.  Then I copied the backup files onto my minikube host under:

/home/bpadmin/.local/share/docker/volumes/minikube/_data/hostpath-provisioner/awx/awx-demo-backup-claim/tower-openshift-backup-2023-04-19-02* 

Then I created an AWXRestore like so:

bpadmin@minikube:~/RESTORE/KOBO$ kubectl get awxbackup -n awx
NAME                   AGE
awxbackup-2023-04-18   14h
bpadmin@minikube:~/RESTORE/KOBO$ cat restore-awx.yml
---
apiVersion: awx.ansible.com/v1beta1
kind: AWXRestore
metadata:
  name: restore1
  namespace: awx
spec:
  deployment_name: awx-demo
  backup_name: awxbackup-2023-04-18

$ kubectl apply -f restore-awx.yml

Then I watched pods come and go ...

bpadmin@minikube:~/RESTORE/KOBO$ date ; kubectl get pods -n awx 
Wed 19 Apr 2023 09:43:04 AM PDT 
NAME                                               READY   STATUS    RESTARTS      AGE 
awx-demo-postgres-13-0                             1/1     Running   0             36m 
awx-operator-controller-manager-7d79f6f96d-r2gr7   2/2     Running   3 (70m ago)   15h 

bpadmin@minikube:~/RESTORE/KOBO$ date ; kubectl get pods -n awx 
Wed 19 Apr 2023 09:43:11 AM PDT 
NAME                                               READY   STATUS              RESTARTS      AGE 
awx-demo-postgres-13-0                             1/1     Running             0             36m 
awx-operator-controller-manager-7d79f6f96d-r2gr7   2/2     Running             3 (70m ago)   15h 
restore1-db-management                             0/1     ContainerCreating   0             4s

bpadmin@minikube:~/RESTORE/KOBO$ date ; kubectl get pods -n awx
Wed 19 Apr 2023 09:44:04 AM PDT
NAME                                               READY   STATUS    RESTARTS      AGE 
awx-demo-757b674d65-gskgf                          4/4     Running   0             35s 
awx-demo-postgres-13-0                             1/1     Running   0             37m 
awx-operator-controller-manager-7d79f6f96d-r2gr7   2/2     Running   3 (71m ago)   15h 
restore1-db-management                             1/1     Running   0             56s 

Then I try to see what's going on on my EE container ...

bpadmin@minikube:~/RESTORE/KOBO$ kubectl exec -it awx-demo-757b674d65-gskgf -n awx -c awx-demo-ee -- bash 
bash-5.1$ ls -l /var/lib/awx/projects/ 
total 0 
bash-5.1$ ls -la /var/lib/awx/projects/ 
total 8 
drwxrwxrwx 2 root root 4096 Apr 19 16:43 . 
drwxr-xr-x 3 root root 4096 Apr 19 16:43 .. 

bash-5.1$ command terminated with exit code 137
bpadmin@minikube:~/RESTORE/KOBO$ date ; kubectl get pods -n awx
Wed 19 Apr 2023 09:44:52 AM PDT
NAME                                               READY   STATUS    RESTARTS      AGE
awx-demo-postgres-13-0                             1/1     Running   0             38m
awx-operator-controller-manager-7d79f6f96d-r2gr7   2/2     Running   3 (71m ago)   15h
restore1-db-management                             1/1     Running   0             104s
bpadmin@minikube:~/RESTORE/KOBO$

And this cycle keeps happening.  Is there some way I can tell if the restore will work or if something has gone awry?

AWX Project

Apr 19, 2023, 2:44:23 PM
to AWX Project
Hi RedCrick,

You could inspect the awx-operator logs to see the error; I bet the restore role has a failing task that is causing the management pod to keep coming up and going away.  Can you paste any errors here?
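
For example, you can tail the operator logs with something like this (the container name awx-manager is the manager container in a default awx-operator install; adjust if yours differs):

kubectl logs -n awx deployment/awx-operator-controller-manager -c awx-manager --tail=500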

Thanks,
AWX Team

RedCrick

Apr 19, 2023, 3:14:56 PM
to AWX Project
I am not sure I am grabbing the correct info but here is what I see:

bpadmin@minikube:~$ kubectl logs awx-operator-controller-manager-7d79f6f96d-r2gr7 -n awx
... lots and lots of log output ... 
TASK [restore : Restore database dump to the new postgresql container] *********
task path: /opt/ansible/roles/restore/tasks/postgres.yml:84
fatal: [localhost]: FAILED! => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result", "changed": true}

PLAY RECAP *********************************************************************
localhost                  : ok=47   changed=10   unreachable=0    failed=1    skipped=13   rescued=0    ignored=0

"name": "restore1", "namespace": "awx", "ts": 1681927674.66927

Do you think that has something to do with the fact that the old postgres PVC deletion never finished terminating?

RedCrick

Apr 20, 2023, 11:31:26 AM
to AWX Project
Ok.  I was finally able to figure out how to get rid of the old postgres PVC.
I had to delete the statefulset that was attached to it like so:

$ kubectl delete statefulset awx-demo-postgres-13 -n awx

But now I see a different failure message in the controller logs.

$ kubectl logs -f awx-operator-controller-manager-7d79f6f96d-r2gr7 -n awx
....
[localhost]: FAILED! => {"changed": true, "failed_when_result": true, "rc": 1, "return_code": 1, "stderr": "pg_restore: error: connection to database \"awx\" failed: connection to server at \"awx-demo-postgres-13.awx.svc.cluster.local\" (10.244.0.9), port 5432 failed: FATAL:  password authentication failed for user \"awx\"", "stderr_lines": ["pg_restore: error: connection to database \"awx\" failed: connection to server at \"awx-demo-postgres-13.awx.svc.cluster.local\" (10.244.0.9), port 5432 failed: FATAL:  password authentication failed for user \"awx\""], "stdout": "", "stdout_lines": []}

PLAY RECAP *********************************************************************
localhost                  : ok=47   changed=10   unreachable=0    failed=1    skipped=13   rescued=0    ignored=0

"job": "3619264603841225764", "name": "restore1", "namespace": "awx", "error": "exit status 2"

Looks like Ansible is not able to connect to the postgres DB.  How can I fix that?

Michael Kelly

Apr 21, 2023, 11:05:44 AM
to awx-p...@googlegroups.com
What does kubectl -n awx get statefulset return?

RedCrick

Apr 21, 2023, 11:09:40 AM
to AWX Project
bpadmin@minikube:~$ kubectl -n awx get statefulset
NAME                   READY   AGE
awx-demo-postgres-13   1/1     23h

Michael Kelly

Apr 21, 2023, 11:46:59 AM
to awx-p...@googlegroups.com
Can you connect to the container by doing
kubectl -n awx exec -it awx-demo-postgres-13 -c postgres -- bash

RedCrick

Apr 21, 2023, 11:50:17 AM
to AWX Project
Not awx-demo-postgres-13 ... but I can exec onto awx-demo-postgres-13-0.

admin@minikube:~$ kubectl -n awx exec -it awx-demo-postgres-13-0 -c postgres -- bash
root@awx-demo-postgres-13-0:/#

RedCrick

Apr 21, 2023, 11:59:45 AM
to AWX Project
And I can get into the db shell ...

root@awx-demo-postgres-13-0:/# psql -U awx
psql (13.10 (Debian 13.10-1.pgdg110+1))
Type "help" for help.
awx=#

Michael Kelly

Apr 21, 2023, 12:02:25 PM
to awx-p...@googlegroups.com
My bad.
Can you connect to the database by doing
psql -d awx -U <DB_USERNAME> -W


RedCrick

Apr 21, 2023, 12:11:16 PM
to AWX Project
Yep, I can see these databases ...

root@awx-demo-postgres-13-0:/# psql -U awx
psql (13.10 (Debian 13.10-1.pgdg110+1))
Type "help" for help.
awx=# \l
                             List of databases
   Name    | Owner | Encoding |  Collate   |   Ctype    | Access privileges
-----------+-------+----------+------------+------------+-------------------
 awx       | awx   | UTF8     | en_US.utf8 | en_US.utf8 |
 postgres  | awx   | UTF8     | en_US.utf8 | en_US.utf8 |
 template0 | awx   | UTF8     | en_US.utf8 | en_US.utf8 | =c/awx           +
           |       |          |            |            | awx=CTc/awx
 template1 | awx   | UTF8     | en_US.utf8 | en_US.utf8 | =c/awx           +
           |       |          |            |            | awx=CTc/awx
(4 rows)

Michael Kelly

Apr 21, 2023, 12:13:32 PM
to awx-p...@googlegroups.com
I just noticed that the error output below includes
failed: connection to server at \\\"awx-demo-postgres-13.awx.svc.cluster.local\\\" (10.244.0.9), port 5432 failed: FATAL
There's no '-0' in it; I'm not sure if that is relevant or not.

RedCrick

Apr 21, 2023, 12:56:45 PM
to AWX Project
hmm ...

admin@minikube:~$ kubectl get all -A | grep awx-demo-postgres-13
awx                    pod/awx-demo-postgres-13-0          1/1         Running   0         25h
awx                    service/awx-demo-postgres-13        ClusterIP   None      <none>    5432/TCP   2d15h
awx                    statefulset.apps/awx-demo-postgres-13   1/1     25h

I wonder if I should try and change the name of one of those resources.

Michael Kelly

Apr 21, 2023, 1:35:16 PM
to awx-p...@googlegroups.com
Try this
kubectl -n awx exec -it <awx pod> -c awx-task -- bash
In the container shell
psql -h awx-demo-postgres-13.awx.svc.cluster.local -d awx -U <DB_USER> -W
Is the connection established?

RedCrick

Apr 21, 2023, 1:59:33 PM
to AWX Project
I think it is making a connection ... but since I don't know the password, I just see this:

root@awx-demo-postgres-13-0:/# psql -U awx -h awx-demo-postgres-13.awx.svc.cluster.local -d awx -W
Password:
psql: error: connection to server at "awx-demo-postgres-13.awx.svc.cluster.local" (10.244.0.9), port 5432 failed: FATAL:  password authentication failed for user "awx"

Michael Kelly

Apr 21, 2023, 2:38:19 PM
to awx-p...@googlegroups.com
Excellent! This is a password issue.
kubectl -n awx get secrets
Hopefully the output will include something that refers to postgres.
If the output includes a secret called awx-demo-postgres-configuration then we can extract the password.
What you can do is:
kubectl -n awx get secret/awx-demo-postgres-configuration -o yaml
The password in the output is just the base64 encoding of the DB password.
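For example, something along these lines should print it directly (assuming the key is named password, as it is in a default awx-operator install):

kubectl -n awx get secret awx-demo-postgres-configuration -o jsonpath='{.data.password}' | base64 -d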

RedCrick

Apr 21, 2023, 2:54:21 PM
to AWX Project
ok thanks! I got the password updated now.

root@awx-demo-postgres-13-0:/# psql -U awx -h awx-demo-postgres-13.awx.svc.cluster.local -d awx -W
Password:
psql: error: connection to server at "awx-demo-postgres-13.awx.svc.cluster.local" (10.244.0.9), port 5432 failed: FATAL:  password authentication failed for user "awx"
root@awx-demo-postgres-13-0:/# psql -U awx
psql (13.10 (Debian 13.10-1.pgdg110+1))
Type "help" for help.
awx=# \password awx
Enter new password for user "awx":
Enter it again:
awx=# quit
root@awx-demo-postgres-13-0:/# psql -U awx -h awx-demo-postgres-13.awx.svc.cluster.local -d awx -W
Password:
psql (13.10 (Debian 13.10-1.pgdg110+1))
Type "help" for help.
awx=#

RedCrick

Apr 21, 2023, 3:15:21 PM
to AWX Project
That did it! I have restored my backup from one AWX server and moved it to a minikube cluster running awx-operator.
Thanks for all the help!  Now I just need to write up a step-by-step procedure.
