In the following attempt, I communicate directly with the API (without the pgo client). To do this, I recorded the pgo client's traffic during the restore command with Wireshark and adopted the necessary parameters.
The following data was used:
URL: https://IP_ADRESSE/restore
Data:
{"FromCluster":"pgo-benchmark-test-exp","PITRTarget":"2021-03-22 10:29:08.000000+01","RestoreOpts":"--type=time --set=20210322-092855F","BackrestStorageType":"","ClientVersion":"4.6.1","Namespace":"exp"}
Hint: In case the BackrestStorageType field is confusing: I reproduced the pgo client's request as captured in Wireshark. Only fields that were also empty, such as nodelabel, were left out. Tests without the BackrestStorageType field return the same error.
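For reference, a minimal sketch of how such a request could be assembled without the pgo client. It only builds the request and prints it; nothing is sent. The explicit charset in the Content-Type header is an assumption rather than something taken from the Wireshark capture, and authentication and TLS settings are omitted because they are not shown in this thread. The function name build_restore_request is hypothetical.

```python
import json
import urllib.request

# Placeholder host exactly as given in this thread; replace with the real API server address.
API_URL = "https://IP_ADRESSE/restore"

# Field names and values copied from the payload quoted above.
payload = {
    "FromCluster": "pgo-benchmark-test-exp",
    "PITRTarget": "2021-03-22 10:29:08.000000+01",
    "RestoreOpts": "--type=time --set=20210322-092855F",
    "BackrestStorageType": "",
    "ClientVersion": "4.6.1",
    "Namespace": "exp",
}


def build_restore_request(url, data):
    """Return a POST request whose JSON body is explicitly UTF-8 encoded."""
    body = json.dumps(data, ensure_ascii=True).encode("utf-8")
    # Assumption: the server accepts a charset parameter on the content type.
    headers = {"Content-Type": "application/json; charset=utf-8"}
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


request = build_restore_request(API_URL, payload)

# Sanity check: the body should contain printable ASCII only, so any stray
# control byte seen later in the pod log would not originate from this payload.
assert all(32 <= byte < 127 for byte in request.data)

print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Building the body with json.dumps and encoding it once, in one place, makes it easy to rule out the client side when the server reports malformed input.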
-> Show Backup:
cluster: pgo-benchmark-test-exp
storage type: posix
stanza: db
status: ok
cipher: none
The Bootstrap Pod goes into error state shortly after start-up and shuts down. The following Pod runs permanently with the familiar error message:
Thu Mar 25 09:01:06 UTC 2021 INFO: Correct the issue, remove '/pgdata/pgo-benchmark-test-exp.initializing', and try again
Thu Mar 25 09:01:06 UTC 2021 INFO: Your data might be in: /pgdata/pgo-benchmark-test-exp_2021-03-25-08-54-00
Thu Mar 25 09:01:16 UTC 2021 WARN: Detected an earlier failed attempt to initialize
Hi,
thank you very much for your answer.
Yes, ending this state is possible, but mostly only with the following steps:
1. remove the backup folder on the pod (the one mentioned in the error message).
2. pgo-client: pgo restore xyz
3. remove the .initializing file and delete the pod
The new pod will then perform the restore, but unfortunately only the standard restore, without the time and snapshot details.
The problem must lie in passing the data (snapshot and time) to the API server. To me, the ? in the path suggests faulty data, but I can't find it in the JSON that is sent to the API server. ( pgdata/pgo-benchmark-test-exp_2021-03-25-08-54-00�[0m )
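One possible reading, not confirmed in this thread: the unprintable character followed by [0m after the path looks like an ANSI color reset sequence (ESC[0m) written by the logging script, in which case it would not be part of the directory name at all. The sketch below assumes the unprintable byte is ESC (0x1b) and shows that the path is intact once such sequences are stripped. The helper name strip_ansi is hypothetical.

```python
import re

# Matches ANSI SGR (color) sequences such as ESC[0;32m and ESC[0m.
ANSI_SGR = re.compile(r"\x1b\[[0-9;]*m")


def strip_ansi(line):
    """Remove ANSI color sequences from a single log line."""
    return ANSI_SGR.sub("", line)


# Assumption: the unprintable character shown in the log is ESC (0x1b).
raw = (
    "\x1b[0;32mThu Mar 25 09:01:06 UTC 2021 INFO: Your data might be in: "
    "/pgdata/pgo-benchmark-test-exp_2021-03-25-08-54-00\x1b[0m"
)

clean = strip_ansi(raw)
print(clean)
```

If the cleaned line ends in the plain directory name, the odd character is terminal coloring and says nothing about the JSON that was sent.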
As I said, it is not about the pgo-client (cli) but about the direct communication with the apiserver (curl).
The JSON used:
{"FromCluster":"pgo-benchmark-test-exp","PITRTarget":"2021-03-22 10:29:08.000000+01","RestoreOpts":"--type=time --set=20210322-092855F","BackrestStorageType":"","ClientVersion":"4.6.1","Namespace":"exp"}
Hi,
Thank you very much for the advice. I explicitly set the encoding in the communication again, and we are one step further :)
Unfortunately, an error remains. It now occurs both via the API and via the pgo CLI client. At the first restore attempt, the following appears in the log:
2021-05-01 08:36:49,061 INFO: Running custom bootstrap script: /opt/crunchy/bin/postgres-ha/pgbackrest/pgbackrest-create-replica.sh primary
Sat May 1 08:36:49 UTC 2021 INFO: Valid PGDATA dir found for primary, a delta restore will be peformed
ERROR: [038]: unable to restore while PostgreSQL is running
HINT: presence of 'postmaster.pid' in '/pgdata/pgo-restore-exp-rmgv' indicates PostgreSQL is running.
HINT: remove 'postmaster.pid' only if PostgreSQL is not running.
Sat May 1 08:36:49 UTC 2021 ERROR: pgBackRest primary Creation: pgBackRest restore failed when creating primary
2021-05-01 08:36:49,096 INFO: removing initialize key after failed attempt to bootstrap the cluster
2021-05-01 08:36:49,144 INFO: renaming data directory to /pgdata/pgo-restore-exp-rmgv_2021-05-01-08-36-49
The error message is clear, but I have not yet been able to find a solution or workaround.
-> Shutting down the primary pod beforehand (deployment = 0) does nothing (waiting time 5-30 min.). The error always comes back at the first attempt.
To get past this, I have to restart the running bootstrap pod (after removing everything inside /pgdata/) and start the restore again.
Is there a solution or workaround so that the restore works on the first attempt?
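As a diagnostic idea only: the HINT lines say that postmaster.pid should be removed only if PostgreSQL is not running, so it may help to check whether the PID recorded in that file still belongs to a live process. The sketch below relies on the first line of postmaster.pid being the postmaster's PID and on POSIX signal 0. It would have to run inside the database container to see the right processes, the helper names are hypothetical, and the demo uses a temporary directory instead of the real /pgdata path.

```python
import os
import tempfile


def read_postmaster_pid(data_dir):
    """Return the PID from the first line of postmaster.pid, or None if the file is absent."""
    pid_file = os.path.join(data_dir, "postmaster.pid")
    try:
        with open(pid_file, encoding="utf-8") as handle:
            return int(handle.readline().strip())
    except FileNotFoundError:
        return None


def process_alive(pid):
    """Return True if a process with this PID exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence, nothing is delivered
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


# Demo against a temporary directory standing in for the real data directory.
with tempfile.TemporaryDirectory() as demo_dir:
    print(read_postmaster_pid(demo_dir))  # no postmaster.pid present yet
    with open(os.path.join(demo_dir, "postmaster.pid"), "w", encoding="utf-8") as handle:
        handle.write(f"{os.getpid()}\n")  # use this script's own PID as a stand-in
    pid = read_postmaster_pid(demo_dir)
    print(pid, process_alive(pid))
```

If the file exists but the PID is dead, the file is stale; whether removing it is safe in this operator setup is not something this thread confirms.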
Unfortunately, I still have the problem that a restore via the pgo client, as well as directly via the API, fails with the error message:
ERROR: [038]: unable to restore while PostgreSQL is running
This isn't ideal, since it's always easier to use the operator's tools, but you can always fall back to the standard Postgres tools if everything else fails. It might be worth giving it a go.
kubectl -n pgo port-forward svc/dashboard 5432:5432
pg_dump -h localhost -U postgres -p 5432 -f prod.sql dashboard   ## backup
psql -h localhost -W -U postgres -f prod.sql dashboard   ## restore
Of course, if you can't take a backup of the data and need to restore, this doesn't help unless you have an existing dataset you can use or a pg_dump file available.
You can also try this pattern using your existing database snapshot:
pgo create cluster -n pgo -u dbadmin --password SECRET --password-superuser SECRET --pvc-size=20Gi --pgbackrest-pvc-size=100Gi dashboard-prod-20-100 --restore-from="dashboard" --restore-opts "--type=time --target='2021-05-04 10:11:30-07'"
In my example above, my current cluster is named dashboard and I'm creating a new one called dashboard-prod-20-100. The DNS name would use dashboard-prod-20-100 instead of the previous name, dashboard. The database, though, is still going to be called dashboard.
This will create a new cluster from your dataset, and then you'll simply need to update the references, likely a K8s secret/configmap, so that the app picks up the changes.