scheduled full backup error

Abbas Gori

Jun 28, 2021, 10:12:05 AM
to Postgres Operator
Hi,

We are using the Postgres Operator to manage around 10 database clusters, and I have created a scheduled full backup that runs every day with a retention of 7 days. The operator is configured to take backups to GCS using MinIO; the operator version is 4.6.1. For the past couple of days, the scheduled job pod has been failing on its first attempt on every database cluster.
I have attached images for better understanding.
qa-backup.png

The logs of the job are:
kubectl logs pgo-qa-full-sch-backup-64txr

Mon Jun 28 01:00:05 UTC 2021 INFO: Image mode found: pgbackrest
Mon Jun 28 01:00:05 UTC 2021 INFO: Starting in 'pgbackrest' mode
time="2021-06-28T01:00:07Z" level=info msg="crunchy-pgbackrest starts"
time="2021-06-28T01:00:07Z" level=info msg="debug flag set to %tfalse"
time="2021-06-28T01:00:07Z" level=info msg="backrest backup command requested"
time="2021-06-28T01:00:07Z" level=info msg="s3 flag enabled for backrest command"
time="2021-06-28T01:00:07Z" level=info msg="command to execute is [pgbackrest backup --stanza=db --type=full --repo1-retention-full=7 --db-host=10.2.3.212 --db-path=/pgdata/pgo-qa --repo1-type=s3 --no-repo1-s3-verify-tls]"
time="2021-06-28T01:09:44Z" level=info msg="output=[]"
time="2021-06-28T01:09:44Z" level=info msg="stderr=[ERROR: [042]: unexpected eof while reading line\n]"
time="2021-06-28T01:09:44Z" level=fatal msg="command terminated with exit code 42"

Abbas Gori

Jun 29, 2021, 3:56:03 AM
to Postgres Operator, Abbas Gori
Is this because all 10 backups are scheduled at the same time? Will it be resolved if I schedule each backup at a different time?
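
To make the idea of staggering concrete, here is a minimal sketch that spreads daily backups across a window by generating one cron expression per cluster. The function name, the 30-minute gap, and the cluster names in the usage note are illustrative assumptions, not operator settings; it also assumes the schedules accept standard five-field cron expressions.

```python
def staggered_cron(clusters, start_hour=1, gap_minutes=30):
    """Return {cluster_name: cron_expression} with daily runs spaced apart.

    start_hour and gap_minutes are illustrative defaults (assumptions),
    not values taken from the operator.
    """
    schedules = {}
    for index, name in enumerate(clusters):
        # Minutes after midnight (UTC) at which this cluster's backup starts.
        total_minutes = start_hour * 60 + index * gap_minutes
        hour, minute = divmod(total_minutes, 60)
        if hour >= 24:
            raise ValueError("schedule would spill past midnight")
        # Five-field cron: minute hour day-of-month month day-of-week.
        schedules[name] = f"{minute} {hour} * * *"
    return schedules
```

For example, `staggered_cron(["pgo-qa", "pgo-recko"])` yields `0 1 * * *` for the first cluster and `30 1 * * *` for the second. The failing jobs in this thread ran for roughly ten minutes before exiting, so the gap would need to be chosen with the real backup duration in mind.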

Jonathan S. Katz

Jun 29, 2021, 4:04:03 AM
to Abbas Gori, Postgres Operator
Yes. pgBackRest presently only allows a single backup to be taken at any given time for a Postgres cluster.

That said, the error seems to indicate something else is occurring at the filesystem level.

I would also recommend looking into upgrading to PGO 4.7, which includes native support for backups being stored in GCS.

Thanks,

Jonathan

Jonathan S. Katz
VP Platform Engineering

Crunchy Data
Enterprise PostgreSQL 


Abbas Gori

Jun 30, 2021, 6:49:26 AM
to Postgres Operator, jonath...@crunchydata.com, Postgres Operator, Abbas Gori

Is there a command to update an existing schedule, or do I have to delete it and recreate it with different timings? I checked the documentation and I don't think there is one.

Abbas Gori

Jun 30, 2021, 6:50:02 AM
to Postgres Operator, jonath...@crunchydata.com, Postgres Operator, Abbas Gori
I tried scheduling the backups at different times, but that did not solve the problem, and some scheduled pods are now showing different errors.


Wed Jun 30 06:00:02 UTC 2021 INFO: Image mode found: pgbackrest
30/06/2021 11:30:02 Wed Jun 30 06:00:02 UTC 2021 INFO: Starting in 'pgbackrest' mode
30/06/2021 11:30:03 time="2021-06-30T06:00:03Z" level=info msg="crunchy-pgbackrest starts"
30/06/2021 11:30:03 time="2021-06-30T06:00:03Z" level=info msg="debug flag set to %tfalse"
30/06/2021 11:30:03 time="2021-06-30T06:00:03Z" level=info msg="backrest backup command requested"
30/06/2021 11:30:03 time="2021-06-30T06:00:03Z" level=info msg="s3 flag enabled for backrest command"
30/06/2021 11:30:03 time="2021-06-30T06:00:03Z" level=info msg="command to execute is [pgbackrest backup --stanza=db --type=full --repo1-retention-full=7 --db-host=10.2.3.152 --db-path=/pgdata/pgo-recko --repo1-type=s3 --no-repo1-s3-verify-tls]"
30/06/2021 11:39:55 time="2021-06-30T06:09:55Z" level=info msg="output=[]"
30/06/2021 11:39:55 time="2021-06-30T06:09:55Z" level=info msg="stderr=[ERROR: [100]: TLS syscall error: [32] Broken pipe\n]"
30/06/2021 11:39:55 time="2021-06-30T06:09:55Z" level=fatal msg="command terminated with exit code 100"
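
Since the job logs in this thread report the failure in a `stderr=[ERROR: [NNN]: ...]` line, a small script can pull the pgBackRest error code and message out of each job's log for comparison across clusters. This is only a sketch: the function name is hypothetical, and the pattern is inferred from the two log excerpts in this thread rather than from any documented log format.

```python
import re

# Pattern inferred from the log excerpts in this thread (an assumption):
#   msg="stderr=[ERROR: [042]: unexpected eof while reading line\n]"
# The trailing \n in the log is a literal backslash followed by "n".
STDERR_RE = re.compile(r'stderr=\[ERROR: \[(\d+)\]: (.*?)(?:\\n)?\]"\s*$')


def extract_pgbackrest_error(log_text):
    """Return (error_code, message) from the first stderr line, or None."""
    for line in log_text.splitlines():
        match = STDERR_RE.search(line)
        if match:
            return int(match.group(1)), match.group(2)
    return None
```

Feeding it the output of `kubectl logs` for each failed job pod would give pairs such as `(42, "unexpected eof while reading line")`, which makes it easier to see whether all clusters fail the same way.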