Database instance not starting

Yuvraj Vansure

Aug 17, 2023, 8:21:01 AM

to Postgres Operator

A Postgres cluster created with a PostgresCluster CR became unhealthy after the worker nodes were drained and restarted.
We are seeing the following logs in the database instance Pod:

2023-07-27 17:01:58,052 INFO: No PostgreSQL configuration items changed, nothing to reload.
2023-07-27 17:01:58,072 WARNING: Postgresql is not running.
2023-07-27 17:01:58,073 INFO: Lock owner: None; I am instrumentationdb-instance1-zttf-0
2023-07-27 17:01:58,077 INFO: pg_controldata:
pg_control version number: 1300
Catalog version number: 202007201
Database system identifier: 7209530190925107282
Database cluster state: shut down in recovery
pg_control last modified: Thu Jul 27 16:31:44 2023
Latest checkpoint location: 15/2A000028
Latest checkpoint's REDO location: 15/2A000028
Latest checkpoint's REDO WAL file: 0000007D000000150000002A
Latest checkpoint's TimeLineID: 125
Latest checkpoint's PrevTimeLineID: 125
Latest checkpoint's full_page_writes: on
Latest checkpoint's NextXID: 0:45422
Latest checkpoint's NextOID: 188416
Latest checkpoint's NextMultiXactId: 1
Latest checkpoint's NextMultiOffset: 0
Latest checkpoint's oldestXID: 478
Latest checkpoint's oldestXID's DB: 1
Latest checkpoint's oldestActiveXID: 0
Latest checkpoint's oldestMultiXid: 1
Latest checkpoint's oldestMulti's DB: 1
Latest checkpoint's oldestCommitTsXid: 0
Latest checkpoint's newestCommitTsXid: 0
Time of latest checkpoint: Thu Jul 13 19:18:39 2023
Fake LSN counter for unlogged rels: 0/3E8
Minimum recovery ending location: 15/2A0000A0
Min recovery ending loc's timeline: 125
Backup start location: 0/0
Backup end location: 0/0
End-of-backup record required: no
wal_level setting: logical
wal_log_hints setting: on
max_connections setting: 100
max_worker_processes setting: 8
max_wal_senders setting: 10
max_prepared_xacts setting: 0
max_locks_per_xact setting: 64
track_commit_timestamp setting: off
Maximum data alignment: 8
Database block size: 8192
Blocks per segment of large relation: 131072
WAL block size: 8192
Bytes per WAL segment: 16777216
Maximum length of identifiers: 64
Maximum columns in an index: 32
Maximum size of a TOAST chunk: 1996
Size of a large-object chunk: 2048
Date/time type storage: 64-bit integers
Float8 argument passing: by value
Data page checksum version: 1
Mock authentication nonce: afec4ac2d2d78c649caa0234cf9eaa6be0c85273f288b81884d878cbf295f8d8

2023-07-27 17:01:58,092 INFO: Lock owner: None; I am instrumentationdb-instance1-zttf-0
2023-07-27 17:01:58,252 INFO: starting as a secondary
2023-07-27 17:01:58,481 INFO: postmaster pid=102
/tmp/postgres:5432 - no response
2023-07-27 17:01:58.495 UTC [102] LOG: pgaudit extension initialized
2023-07-27 17:01:58.516 UTC [102] LOG: redirecting log output to logging collector process
2023-07-27 17:01:58.516 UTC [102] HINT: Future log output will appear in directory "log".
/tmp/postgres:5432 - accepting connections
/tmp/postgres:5432 - accepting connections
2023-07-27 17:01:59,589 INFO: establishing a new patroni connection to the postgres cluster
2023-07-27 17:01:59,667 INFO: My wal position exceeds maximum replication lag
2023-07-27 17:01:59,788 INFO: following a different leader because i am not the healthiest node
2023-07-27 17:02:10,090 INFO: My wal position exceeds maximum replication lag
2023-07-27 17:02:10,099 INFO: following a different leader because i am not the healthiest node
2023-07-27 17:02:20,090 INFO: My wal position exceeds maximum replication lag
2023-07-27 17:02:20,100 INFO: following a different leader because i am not the healthiest node
2023-07-27 17:02:30,090 INFO: My wal position exceeds maximum replication lag
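
For context on the repeating "My wal position exceeds maximum replication lag" line: as far as I understand it, Patroni subtracts the instance's own WAL position from the last leader position recorded in the DCS and refuses to promote when the difference is larger than `maximum_lag_on_failover` (1048576 bytes unless configured otherwise). The block below is only a minimal sketch of that comparison, not Patroni's actual code. The leader LSN is a hypothetical value, since the real one does not appear in the log, and the "Minimum recovery ending location" from pg_controldata is used purely as a stand-in for the replica's WAL position.

```python
# Sketch of the lag comparison that appears to be behind
# "My wal position exceeds maximum replication lag".
# Assumption: lag = last known leader LSN - local WAL position, compared
# against maximum_lag_on_failover. This is an illustration, not Patroni code.

DEFAULT_MAX_LAG = 1048576  # bytes; default value of maximum_lag_on_failover


def parse_lsn(lsn: str) -> int:
    """Convert a PostgreSQL LSN such as '15/2A0000A0' into a byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def is_lagging(leader_lsn: str, local_lsn: str, max_lag: int = DEFAULT_MAX_LAG) -> bool:
    """Return True when the local position trails the leader by more than max_lag bytes."""
    return parse_lsn(leader_lsn) - parse_lsn(local_lsn) > max_lag


if __name__ == "__main__":
    local = "15/2A0000A0"   # "Minimum recovery ending location" from the log above
    leader = "15/2C000000"  # hypothetical; the real leader position is not in the log
    print("lag in bytes:", parse_lsn(leader) - parse_lsn(local))
    print("too far behind to promote:", is_lagging(leader, local))
```

With `replicas: 1` and "Lock owner: None", a check like this would leave the only instance running as a secondary for as long as the recorded leader position stays ahead of it, which matches the loop in the log.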

PostgresCluster CR
apiVersion: postgres-operator.crunchydata.com/v1beta1
kind: PostgresCluster
metadata:
  name: instrumentationdb
  namespace: ibm-common-services
  finalizers:
    - postgres-operator.crunchydata.com/finalizer
spec:
  backups:
    pgbackrest:
      image: >-
        registry.connect.redhat.com/crunchydata/crunchy-pgbackrest@sha256:efe775d3208befb2b7f026ef5fee3b03b306a9ba773709ec5c4c3391880ee60b
      repoHost: {}
      repos:
        - name: repo1
          schedules:
            incremental: '@daily'
          volume:
            volumeClaimSpec:
              accessModes:
                - ReadWriteOnce
              resources:
                requests:
                  storage: 10G
              storageClassName: ocs-storagecluster-ceph-rbd
      restore:
        enabled: true
        options:
          - '--type=time'
          - '--target="2023-08-10 04:59:02"'
        repoName: repo1
  image: >-
    registry.connect.redhat.com/crunchydata/crunchy-postgres@sha256:6b570ee2922281eedc5c267c50ad30a895fbb4e8a132c3e2c3a38e29fe3d6f6a
  instances:
    - dataVolumeClaimSpec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 10G
        storageClassName: ocs-storagecluster-ceph-rbd
      name: instance1
      replicas: 1
  openshift: true
  port: 5432
  postgresVersion: 13
  users:
    - databases:
        - instrumentationdb
      name: udsdbuser
    - name: postgres
status:
  observedGeneration: 21677
  usersRevision: '858744695'
  monitoring:
    exporterConfiguration: 559c4c97d6
  proxy:
    pgBouncer:
      postgresRevision: 5c9966f6bc
  pgbackrest:
    repoHost:
      apiVersion: apps/v1
      kind: StatefulSet
      ready: true
    repos:
      - bound: true
        name: repo1
        replicaCreateBackupComplete: true
        stanzaCreated: true
        volume: pvc-ddf1d026-6f33-4d57-bb29-c4ee07cce5aa
    scheduledBackups:
      - completionTime: '2023-07-11T00:00:11Z'
        cronJobName: instrumentationdb-repo1-incr
        repo: repo1
        startTime: '2023-07-11T00:00:00Z'
        succeeded: 1
        type: incr
      - completionTime: '2023-07-12T00:00:10Z'
        cronJobName: instrumentationdb-repo1-incr
        repo: repo1
        startTime: '2023-07-12T00:00:00Z'
        succeeded: 1
        type: incr
      - completionTime: '2023-07-13T00:00:10Z'
        cronJobName: instrumentationdb-repo1-incr
        repo: repo1
        startTime: '2023-07-13T00:00:00Z'
        succeeded: 1
        type: incr
      - cronJobName: instrumentationdb-repo1-incr
        failed: 7
        repo: repo1
        startTime: '2023-08-10T00:00:00Z'
        type: incr
  databaseRevision: 55c88b8fdb
  conditions:
    - lastTransitionTime: '2023-08-01T08:34:05Z'
      message: pgBackRest dedicated repository host is ready
      observedGeneration: 21677
      reason: RepoHostReady
      status: 'True'
      type: PGBackRestRepoHostReady
    - lastTransitionTime: '2023-03-12T05:39:26Z'
      message: pgBackRest replica create repo is ready for backups
      observedGeneration: 21677
      reason: StanzaCreated
      status: 'True'
      type: PGBackRestReplicaRepoReady
    - lastTransitionTime: '2023-03-12T05:40:40Z'
      message: pgBackRest replica creation is now possible
      observedGeneration: 21677
      reason: RepoBackupComplete
      status: 'True'
      type: PGBackRestReplicaCreate
  patroni:
    systemIdentifier: '7209530190925107282'
  instances:
    - name: instance1
      readyReplicas: 1
      replicas: 1
      updatedReplicas: 1
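
The backup history in the status is easier to read when it is reduced to the last successful backup and the jobs that are failing. The block below is a small sketch of that reduction. The entries are copied by hand from `status.pgbackrest.scheduledBackups` above (with `cronJobName` and `repo` left out for brevity), and the helper name `summarize` is made up for illustration; it is not part of the operator.

```python
# Entries copied from status.pgbackrest.scheduledBackups in the CR above.
scheduled_backups = [
    {"completionTime": "2023-07-11T00:00:11Z", "startTime": "2023-07-11T00:00:00Z",
     "succeeded": 1, "type": "incr"},
    {"completionTime": "2023-07-12T00:00:10Z", "startTime": "2023-07-12T00:00:00Z",
     "succeeded": 1, "type": "incr"},
    {"completionTime": "2023-07-13T00:00:10Z", "startTime": "2023-07-13T00:00:00Z",
     "succeeded": 1, "type": "incr"},
    {"failed": 7, "startTime": "2023-08-10T00:00:00Z", "type": "incr"},
]


def summarize(entries):
    """Return (latest successful completionTime, start times of failing jobs).

    The timestamps are ISO 8601 strings in UTC with a fixed layout,
    so plain string comparison orders them chronologically.
    """
    last_ok = max(
        (e["completionTime"] for e in entries if e.get("succeeded")),
        default=None,
    )
    failing = [e["startTime"] for e in entries if e.get("failed")]
    return last_ok, failing


if __name__ == "__main__":
    last_ok, failing = summarize(scheduled_backups)
    print("last successful backup:", last_ok)
    print("failing backup jobs started at:", failing)
```

Going by the entries listed in this status, the last successful backup completed on 2023-07-13, the same day as the "Time of latest checkpoint" in the pg_controldata output, and the restore target of 2023-08-10 04:59:02 is later than any successful backup shown here.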

Environment

Please provide the following details:

Platform: OpenShift
Platform Version: 4.12
Postgres Version: 13
Operator Version: postgresoperator.v5.4.1
Storage: ocs-storagecluster-ceph-rbd
