postgres-operator stanza-create job failing | pgbackrest not able to connect to postgres db


Vijender Kumar

Apr 16, 2021, 7:20:48 AM
to Postgres Operator
Hi,
We have Postgres Operator 4.6.2, installed with Helm (version 4.6.2).
The postgres-operator stanza-create job fails with the following error message:

Fri Apr 16 06:43:28 UTC 2021 INFO: Image mode found: pgbackrest
Fri Apr 16 06:43:28 UTC 2021 INFO: Starting in 'pgbackrest' mode
time="2021-04-16T06:43:28Z" level=info msg="crunchy-pgbackrest starts"
time="2021-04-16T06:43:28Z" level=info msg="debug flag set to %tfalse"
time="2021-04-16T06:43:28Z" level=info msg="backrest stanza-create command requested"
time="2021-04-16T06:43:28Z" level=info msg="command to execute is [pgbackrest stanza-create --db-host=some-db-host --db-path=/pgdata/some-db-name]"
time="2021-04-16T06:43:28Z" level=info msg="output=[]"
time="2021-04-16T06:43:28Z" level=info msg="stderr=[WARN: unable to check pg-1: [UnknownError] remote-0 process on 'some-db-host' terminated unexpectedly [255]: postgres@some-db-host: Permission denied (publickey,keyboard-interactive).\nERROR: [056]: unable to find primary cluster - cannot proceed\n]"
time="2021-04-16T06:43:28Z" level=fatal msg="command terminated with exit code 56"


  • Operating System: Linux based node
  • Where is this running ( Local, Cloud Provider): GCP
  • Storage being used (NFS, Hostpath, Gluster, etc): GCE
  • Container Image Tag: centos7-4.4.0
  • PostgreSQL Version: 12.6
  • Platform (Docker, Kubernetes, OpenShift): OpenShift
  • Platform Version: 1.17

Jonathan S. Katz

Apr 16, 2021, 8:04:32 AM
to Vijender Kumar, Postgres Operator
Hi,

There was a bug in upgrading to 4.6.2, and a fix has been committed for a future release. The commit for this is here:


This issue provides a remediation step:


Thanks,

Jonathan

Jonathan S. Katz
VP Platform Engineering

Crunchy Data
Enterprise PostgreSQL 


Jonathan S. Katz

Apr 17, 2021, 9:43:26 AM
to Vijender Kumar, Postgres Operator
Hi,


Per:


Is this the setting in the pgBackRest repo Secret for a specific cluster, i.e. "$CLUSTERNAME-backrest-repo-config"?

Is this from upgrading or a new cluster?

Do you have any Pod Security Policies or SCCs in place?

Thanks,

Jonathan



Jonathan S. Katz

Apr 17, 2021, 12:33:24 PM
to Vijender Kumar, Postgres Operator
Hi Vijender,

Thanks.

4.6.2 contains a change to some of the "securityContext" settings that can be set on the Pod. I am wondering if any of these are having an effect, though the one conflict that I know of (i.e. "UsePAM yes" does not mesh with "allowPrivilegeEscalation: false") can be ruled out here...I think.
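
As a rough illustration of the conflict mentioned above, the following Python sketch checks whether an sshd_config enables PAM while a Pod spec sets "allowPrivilegeEscalation: false" on any container. It is not part of the operator; the function names and the container name "database" are hypothetical, and it assumes you have already obtained the sshd_config text and the Pod spec (as a dict) yourself.

```python
from typing import Any, Dict, List


def uses_pam(sshd_config_text: str) -> bool:
    """Return True if the first UsePAM directive in the text is set to yes."""
    for raw_line in sshd_config_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # sshd_config accepts both "Keyword value" and "Keyword=value".
        parts = line.replace("=", " ", 1).split()
        if len(parts) >= 2 and parts[0].lower() == "usepam":
            # sshd keeps the first value it reads for a keyword.
            return parts[1].lower() == "yes"
    return False  # no directive found; upstream OpenSSH defaults to "no"


def containers_blocking_escalation(pod_spec: Dict[str, Any]) -> List[str]:
    """List containers whose securityContext sets allowPrivilegeEscalation to false."""
    names = []
    for container in pod_spec.get("containers", []):
        ctx = container.get("securityContext") or {}
        if ctx.get("allowPrivilegeEscalation") is False:
            names.append(container.get("name", "<unnamed>"))
    return names


def has_known_conflict(sshd_config_text: str, pod_spec: Dict[str, Any]) -> bool:
    """True when UsePAM is on and at least one container forbids escalation."""
    return uses_pam(sshd_config_text) and bool(containers_blocking_escalation(pod_spec))


if __name__ == "__main__":
    # Hypothetical inputs for illustration only.
    example_config = "# sshd settings\nUsePAM yes\n"
    example_pod_spec = {
        "containers": [
            {"name": "database", "securityContext": {"allowPrivilegeEscalation": False}}
        ]
    }
    print(has_known_conflict(example_config, example_pod_spec))
```

If the check returns False for both the Postgres and pgBackRest Pods, this particular conflict is unlikely to be the cause, and the SSH key setup is the next thing to look at.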

A few more questions:

- Which container runtime are you using, e.g. Docker, cri-o, etc.? There was an issue with the RHEL Docker package prior to docker-1.13.1-161, which is why we originally had UsePAM set to yes.


Later packages work fine. I wonder if you are hitting this?

- Perhaps a few more steps to try:

-- On the Postgres & pgBackRest Deployments, try removing the "allowPrivilegeEscalation" securityContexts. Note that we don't allow things to run as root anyway, but sshd may trigger privilege-escalation behavior.
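
One way to try the step above is to export each Deployment as JSON (for example with "kubectl get deployment <name> -o json"), remove the setting, and re-apply the result. The Python sketch below does only the middle part. The function name and the sample manifest are hypothetical, and whether the operator later restores the setting is not something this sketch addresses.

```python
import copy
from typing import Any, Dict


def strip_allow_privilege_escalation(deployment: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of a Deployment manifest without allowPrivilegeEscalation.

    Only container-level securityContexts in the Pod template are touched;
    the input dict is left unmodified.
    """
    patched = copy.deepcopy(deployment)
    pod_spec = patched.get("spec", {}).get("template", {}).get("spec", {})
    for key in ("containers", "initContainers"):
        for container in pod_spec.get(key) or []:
            ctx = container.get("securityContext")
            if isinstance(ctx, dict):
                ctx.pop("allowPrivilegeEscalation", None)
                if not ctx:
                    # Drop the securityContext entirely if nothing else is left.
                    del container["securityContext"]
    return patched


if __name__ == "__main__":
    # Hypothetical, heavily trimmed manifest for illustration only.
    manifest = {
        "kind": "Deployment",
        "spec": {
            "template": {
                "spec": {
                    "containers": [
                        {
                            "name": "database",
                            "securityContext": {
                                "allowPrivilegeEscalation": False,
                                "readOnlyRootFilesystem": True,
                            },
                        }
                    ]
                }
            }
        },
    }
    patched = strip_allow_privilege_escalation(manifest)
    print(patched["spec"]["template"]["spec"]["containers"][0]["securityContext"])
```

Working on a copy keeps the original export around, so the change is easy to revert if it makes no difference.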

Jonathan




On Sat, Apr 17, 2021 at 12:00 PM Vijender Kumar <vijende...@optiva.com> wrote:
Hi Jonathan,
Yes, this setting is from the sshd_config file present in my cluster's database container.
This is from a new cluster.
About Pod Security Policies or SCCs: No, we do not have any policies in place.

Thanks 
