psql: error: could not connect to server: Connection refused


Shahid Hussain

Apr 28, 2023, 6:25:49 AM
to Postgres Operator
Hello Everyone,
We are using PGO v5.3.0, installed via Helm.
Our setup was functioning properly for a while, but we recently received reports from clients regarding connectivity issues with Postgres. To investigate, I accessed the pgdb-instance pod and attempted to use the psql command, but received the same error message:

bash-4.4$ psql
psql: error: could not connect to server: Connection refused
        Is the server running locally and accepting
        connections on Unix domain socket "/tmp/postgres/.s.PGSQL.5432"?
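
A note on reading this error: for a Unix domain socket, "Connection refused" generally means the socket file exists but no process is listening on it, whereas a missing socket file is reported as "No such file or directory". So the message is consistent with a server that was running earlier and has since stopped. Below is a minimal Python sketch of that distinction. The helper name `probe_unix_socket` is hypothetical and is not part of PGO, Patroni, or psql; the path in the usage note is the one from the message above.

```python
import socket


def probe_unix_socket(path, timeout=2.0):
    """Try to connect to a Unix domain socket and describe the outcome."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect(path)
    except FileNotFoundError:
        # No socket file at all: the server probably never created one here.
        return "no socket file"
    except ConnectionRefusedError:
        # The file is present but no process is accepting connections on it.
        return "socket file exists but nothing is listening"
    except PermissionError:
        return "permission denied"
    else:
        return "accepting connections"
    finally:
        sock.close()
```

Inside the pod this could be called as `probe_unix_socket("/tmp/postgres/.s.PGSQL.5432")`, assuming Python is available there (the Patroni process in the listing suggests it is).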

I am also unable to access the directories under /pgdata/:
bash-4.4$ cd /pgdata
bash-4.4$ du -sh
du: cannot read directory './pg13': Permission denied
du: cannot read directory './pg13_wal': Permission denied

I also confirmed that enough space is available for the pgdata directory:
bash-4.4$ df -h|grep pgdata
10.213.9.248:/srv/nfs/kubedata/sfe-eva-2-pgdb-instance1-rk7b-pgdata-pvc-66063a3d-67e8-4f98-89ae-95233881915b   67G  3.8G   60G   6% /pgdata
bash-4.4$

Lastly, I found that the postgres processes are not running either:
bash-4.4$ ps -efm |grep postgr
postgres      42       0  0 Apr27 ?        00:04:27 /usr/bin/python3.6 /usr/local/bin/patroni /etc/patroni
postgres       -       -  0 Apr27 -        00:02:45 -
postgres       -       -  0 Apr27 -        00:00:03 -
postgres       -       -  0 Apr27 -        00:00:02 -
postgres       -       -  0 Apr27 -        00:00:00 -
postgres       -       -  0 Apr27 -        00:00:23 -
postgres       -       -  0 Apr27 -        00:00:15 -
postgres      50       0  0 Apr27 ?        00:00:03 replication-cert-copy -ceu monitor
postgres       -       -  0 Apr27 -        00:00:03 -
postgres      60       0  0 Apr27 ?        00:00:08 pgbackrest server
postgres       -       -  0 Apr27 -        00:00:08 -
postgres      69       0  0 Apr27 ?        00:00:05 pgbackrest-config -ceu monitor
postgres       -       -  0 Apr27 -        00:00:05 -
postgres   83582       0  0 03:36 pts/0    00:00:00 bash
postgres       -       -  0 03:36 -        00:00:00 -
postgres   84041   83582  0 03:43 pts/0    00:00:00 ps -efm
postgres       -       -  0 03:43 -        00:00:00 -
postgres   84042   83582  0 03:43 pts/0    00:00:00 grep postgr
postgres       -       -  0 03:43 -        00:00:00 -
bash-4.4$
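
Note that every line in this listing matches `grep postgr` because all processes run as the postgres user, so the match comes from the user column rather than the command. A sketch of a stricter check follows, assuming the `ps -ef` column layout shown above (the command starts in the eighth column). The function name `postgres_server_running` is hypothetical.

```python
import os


def postgres_server_running(ps_output):
    """Return True if any `ps -ef` style line has postgres as its command."""
    for line in ps_output.splitlines():
        # UID PID PPID C STIME TTY TIME CMD... (CMD may contain spaces)
        fields = line.split(None, 7)
        if len(fields) < 8:
            continue
        executable = fields[7].split()[0]
        # Accept "/usr/pgsql-13/bin/postgres" as well as "postgres: checkpointer".
        if os.path.basename(executable).rstrip(":") == "postgres":
            return True
    return False
```

Fed the listing above, this returns False: the only commands present are patroni, the pgbackrest helpers, and the shell session itself.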

It appears that Postgres has crashed. Is there a way to verify what could have caused the crash?

Thanks & Regards,
Shahid

Tony Landreth

May 2, 2023, 9:36:01 AM
to Postgres Operator, Shahid Hussain
Hi Shahid,

Sorry to hear about the mysterious crash and connectivity issues.
Do you see anything abnormal in the logs when you "kubectl logs pgdb-instance"?
Can you provide more information about the cluster, including the Helm charts?

Thanks,
Tony

Maksym Babenko

May 5, 2023, 7:03:25 AM
to Postgres Operator, Shahid Hussain
Can you please provide more details? )
For example, exec into one of your database pods and run
patronictl list
Also, are there any logs related to your pods?

Maksym Babenko

May 9, 2023, 7:50:32 AM
to Shahid Hussain, Postgres Operator, tony.l...@crunchydata.com
Hi ) 
Maybe you missed it 


On 9. May 2023, at 10:45, Shahid Hussain <shn...@gmail.com> wrote:

I want to thank Maksym and Tony for their prompt responses. I apologize for not replying earlier; I had to apply a workaround to recover the system and then kept it under monitoring. Unfortunately, the issue happened again with a fresh installation.

The root cause of the pgdb instance crash was that the ownership of the directories under pgdata had been changed to root instead of postgres. The pgdb instance could no longer access them, which ultimately led to the crash. I manually changed the ownership back to postgres at the NFS location, and that resolved the issue.
drwxr-xr-x  3 root root 4.0K May  8 15:34 pgbackrest
drwx------  3 root root 4.0K May  9 00:25 pg13_wal
drwx------ 18 root root 4.0K May  9 12:12 pg13
[root@SFE-NFS ~]#
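
A check like the listing above can be scripted so that an ownership change is noticed before the database fails. This is only a sketch under assumptions: `owner_report` and `uid_to_name` are hypothetical helpers, and the expected owner name ("postgres") and base directory ("/pgdata") in the usage note are taken from this thread rather than from any PGO default.

```python
import os
import pwd


def uid_to_name(uid):
    """Resolve a numeric uid to a user name, falling back to the number."""
    try:
        return pwd.getpwuid(uid).pw_name
    except KeyError:
        # Containers often run with a uid that has no passwd entry.
        return str(uid)


def owner_report(base_dir, expected_user):
    """List (name, owner, matches_expected) for each entry directly under base_dir."""
    report = []
    for entry in sorted(os.scandir(base_dir), key=lambda e: e.name):
        # stat() on the entry only needs search permission on base_dir,
        # so it still works when the entry itself is unreadable.
        owner = uid_to_name(entry.stat(follow_symlinks=False).st_uid)
        report.append((entry.name, owner, owner == expected_user))
    return report
```

For example, `owner_report("/pgdata", "postgres")` would flag pg13 and pg13_wal as mismatched in the state shown above.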


We are currently investigating why the directories are being assigned to root. As this storage is created by the pod, we don't have explicit control over the ownership assigned to the directories. We are using managed-nfs-storage as the storage class.

@tony
2023-05-09 08:25:56,172 WARNING: Retry got exception: 'connection problems'
/tmp/postgres:5432 - no response
2023-05-09 08:25:56,189 ERROR: Unexpected exception
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/patroni/ha.py", line 1514, in run_cycle
    info = self._run_cycle()
  File "/usr/local/lib/python3.6/site-packages/patroni/ha.py", line 1410, in _run_cycle
    if self.state_handler.data_directory_empty():
  File "/usr/local/lib/python3.6/site-packages/patroni/postgresql/__init__.py", line 297, in data_directory_empty
    return data_directory_is_empty(self._data_dir)
  File "/usr/local/lib/python3.6/site-packages/patroni/utils.py", line 481, in data_directory_is_empty
    return all(os.name != 'nt' and (n.startswith('.') or n == 'lost+found') for n in os.listdir(data_dir))
PermissionError: [Errno 13] Permission denied: '/pgdata/pg13'
2023-05-09 08:25:56,190 INFO: Unexpected exception raised, please report it as a BUG
2023-05-09 08:26:05,290 INFO: establishing a new patroni connection to the postgres cluster
2023-05-09 08:26:06,053 INFO: establishing a new patroni connection to the postgres cluster
2023-05-09 08:26:06,054 WARNING: Retry got exception: 'connection problems'
/tmp/postgres:5432 - no response
2023-05-09 08:26:06,070 ERROR: Unexpected exception
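
The traceback shows the exception being raised by `os.listdir` on the data directory, before any emptiness check can run. The sketch below is not Patroni's code; it is a hypothetical pre-check (`describe_data_dir`) that applies the same "hidden entries and lost+found do not count" rule visible in the traceback, for POSIX systems only, and reports an unreadable directory as its own case instead of raising.

```python
import os


def describe_data_dir(data_dir):
    """Classify a data directory as missing, unreadable, empty, or not empty."""
    try:
        names = os.listdir(data_dir)
    except FileNotFoundError:
        return "missing"
    except PermissionError as exc:
        # This is the case hit in the log above (Errno 13).
        return "unreadable: " + exc.strerror
    # Entries starting with '.' and 'lost+found' are ignored, as in the traceback.
    visible = [n for n in names if not (n.startswith(".") or n == "lost+found")]
    return "not empty" if visible else "empty"
```

Run as the postgres user against /pgdata/pg13 in the state described here, this would be expected to return the "unreadable" result rather than raise.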

@Maksym:
It seems that the Postgres pod has crashed, and I am unable to access PSQL to retrieve the Patroni list.
bash-4.4$ psql
psql: error: could not connect to server: Connection refused
        Is the server running locally and accepting
        connections on Unix domain socket "/tmp/postgres/.s.PGSQL.5432"?
bash-4.4$

Shahid Hussain

May 9, 2023, 7:50:35 AM
to Postgres Operator, Maksym Babenko, Shahid Hussain, tony.l...@crunchydata.com
bash-4.4$ psql
psql: error: could not connect to server: Connection refused
        Is the server running locally and accepting
        connections on Unix domain socket "/tmp/postgres/.s.PGSQL.5432"?
bash-4.4$

On Friday, 5 May 2023 at 16:33:25 UTC+5:30 Maksym Babenko wrote:

Shahid Hussain

May 9, 2023, 9:08:53 AM
to Postgres Operator, Shahid Hussain, Maksym Babenko, tony.l...@crunchydata.com
Hi All,
Thanks for your responses!
We have identified the cause of this issue. A module that is being installed on our NFS server changes the ownership of all storage to root. As a result, Postgres can no longer access its storage and crashes.
We are fixing this on our side.
