scheduled full backup error

Abbas Gori

Jun 28, 2021, 10:12:05 AM

to Postgres Operator

Hi,

We are using the Postgres Operator to manage around 10 database clusters, and I have created a daily scheduled full backup with a retention of 7 days. The operator is configured to take backups to GCS using MinIO; the operator version is 4.6.1. For the past couple of days, the scheduled job pod has been failing on its first attempt on all the database clusters.

I have attached images for better understanding.

The logs of the job are:
kubectl logs pgo-qa-full-sch-backup-64txr

Mon Jun 28 01:00:05 UTC 2021 INFO: Image mode found: pgbackrest

Mon Jun 28 01:00:05 UTC 2021 INFO: Starting in 'pgbackrest' mode

time="2021-06-28T01:00:07Z" level=info msg="crunchy-pgbackrest starts"

time="2021-06-28T01:00:07Z" level=info msg="debug flag set to %tfalse"

time="2021-06-28T01:00:07Z" level=info msg="backrest backup command requested"

time="2021-06-28T01:00:07Z" level=info msg="s3 flag enabled for backrest command"

time="2021-06-28T01:00:07Z" level=info msg="command to execute is [pgbackrest backup --stanza=db --type=full --repo1-retention-full=7 --db-host=10.2.3.212 --db-path=/pgdata/pgo-qa --repo1-type=s3 --no-repo1-s3-verify-tls]"

time="2021-06-28T01:09:44Z" level=info msg="output=[]"

time="2021-06-28T01:09:44Z" level=info msg="stderr=[ERROR: [042]: unexpected eof while reading line\n]"

time="2021-06-28T01:09:44Z" level=fatal msg="command terminated with exit code 42"

Abbas Gori

Jun 29, 2021, 3:56:03 AM

to Postgres Operator, Abbas Gori

Is this because all 10 backups are scheduled at the same time? Will it be resolved if I schedule each backup at a different time?
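
If the schedules are moved apart, the start times could be generated rather than picked by hand. The following is a minimal Python sketch, not part of PGO: `staggered_cron`, `start_hour`, and `gap_minutes` are hypothetical names, the cluster names are assumed from the `--db-path` values in the logs, and the 30-minute gap is an arbitrary choice that should be longer than one backup actually takes.

```python
def staggered_cron(clusters, start_hour=1, gap_minutes=30):
    """Return {cluster: daily cron expression}, one time slot per cluster.

    Hypothetical helper for illustration only; it just does the arithmetic
    and does not talk to the operator.
    """
    schedules = {}
    for index, name in enumerate(clusters):
        # Minutes after midnight (UTC) at which this cluster's backup starts.
        total = start_hour * 60 + index * gap_minutes
        if total >= 24 * 60:
            raise ValueError("slots run past midnight; reduce gap_minutes")
        hour, minute = divmod(total, 60)
        # Standard five-field cron: minute hour day-of-month month day-of-week.
        schedules[name] = f"{minute} {hour} * * *"
    return schedules


if __name__ == "__main__":
    # Cluster names assumed from the db-path values in the logs.
    for name, cron in staggered_cron(["pgo-qa", "pgo-recko"]).items():
        print(f"{name}: {cron}")
```

Each resulting expression would then be used when the schedule for that cluster is recreated.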

Jonathan S. Katz

Jun 29, 2021, 4:04:03 AM

to Abbas Gori, Postgres Operator

Yes. pgBackRest presently only allows a single backup to be taken at any given time for a Postgres cluster.

That said, the error seems to indicate something else is occurring at the filesystem level.

I would also recommend looking into upgrading to PGO 4.7, which includes native support for backups being stored in GCS.

Thanks,

Jonathan

Jonathan S. Katz

VP Platform Engineering

Crunchy Data
Enterprise PostgreSQL

www.crunchydata.com

Abbas Gori

Jun 30, 2021, 6:49:26 AM

to Postgres Operator, jonath...@crunchydata.com, Postgres Operator, Abbas Gori

Is there a command to update an existing schedule, or do I have to delete it and recreate it with different timings? I checked the documentation and I don't think there is one.

Abbas Gori

Jun 30, 2021, 6:50:02 AM

to Postgres Operator, jonath...@crunchydata.com, Postgres Operator, Abbas Gori

I tried scheduling the backups at different times, but that did not solve the problem, and some scheduled pods are now showing different errors:

Wed Jun 30 06:00:02 UTC 2021 INFO: Image mode found: pgbackrest

30/06/2021 11:30:02 Wed Jun 30 06:00:02 UTC 2021 INFO: Starting in 'pgbackrest' mode

30/06/2021 11:30:03 time="2021-06-30T06:00:03Z" level=info msg="crunchy-pgbackrest starts"

30/06/2021 11:30:03 time="2021-06-30T06:00:03Z" level=info msg="debug flag set to %tfalse"

30/06/2021 11:30:03 time="2021-06-30T06:00:03Z" level=info msg="backrest backup command requested"

30/06/2021 11:30:03 time="2021-06-30T06:00:03Z" level=info msg="s3 flag enabled for backrest command"

30/06/2021 11:30:03 time="2021-06-30T06:00:03Z" level=info msg="command to execute is [pgbackrest backup --stanza=db --type=full --repo1-retention-full=7 --db-host=10.2.3.152 --db-path=/pgdata/pgo-recko --repo1-type=s3 --no-repo1-s3-verify-tls]"

30/06/2021 11:39:55 time="2021-06-30T06:09:55Z" level=info msg="output=[]"

30/06/2021 11:39:55 time="2021-06-30T06:09:55Z" level=info msg="stderr=[ERROR: [100]: TLS syscall error: [32] Broken pipe\n]"

30/06/2021 11:39:55 time="2021-06-30T06:09:55Z" level=fatal msg="command terminated with exit code 100"
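
To compare failures across clusters, the job logs can be reduced to the pgBackRest error code, its message, and how long the command ran before it failed. The following is a rough Python sketch that assumes the log format shown in this thread; `summarize` and both regular expressions are hypothetical and only cover the line shapes quoted here.

```python
import re
from datetime import datetime

# Matches the logrus-style lines in the job output, with or without a
# leading local-time prefix added by a log viewer.
LINE = re.compile(r'time="(?P<ts>[^"]+)" level=(?P<level>\w+) msg="(?P<msg>.*)"$')
# Matches the pgBackRest error inside msg="stderr=[ERROR: [NNN]: text\n]".
ERROR = re.compile(r'ERROR: \[(?P<code>\d+)\]: (?P<text>.*?)(?:\\n)?\]$')


def summarize(log_text):
    """Return the error code, message, and seconds from start to failure."""
    started = failed = code = text = None
    for raw in log_text.splitlines():
        match = LINE.search(raw.strip())
        if not match:
            continue
        stamp = datetime.strptime(match["ts"], "%Y-%m-%dT%H:%M:%SZ")
        if match["msg"].startswith("command to execute is"):
            started = stamp
        error = ERROR.search(match["msg"])
        if error:
            failed = stamp
            code = int(error["code"])
            text = error["text"]
    elapsed = (failed - started).total_seconds() if started and failed else None
    return {"code": code, "message": text, "elapsed_seconds": elapsed}


if __name__ == "__main__":
    # Lines copied from the first failure in this thread (command shortened).
    sample = r'''
time="2021-06-28T01:00:07Z" level=info msg="command to execute is [pgbackrest backup --stanza=db --type=full]"
time="2021-06-28T01:09:44Z" level=info msg="stderr=[ERROR: [042]: unexpected eof while reading line\n]"
time="2021-06-28T01:09:44Z" level=fatal msg="command terminated with exit code 42"
'''
    print(summarize(sample))
```

Run against the two logs above, this gives 577 seconds for the `[042]` failure and 592 seconds for the `[100]` failure, so both commands ran for just under ten minutes before failing.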
