Hi community, I've been testing a PostgreSQL HA environment which includes:
Node 1 Primary : Linux OEL9, PostgreSQL 17.4, repmgr 5.5.0, HAProxy version 2.4.22
Node 2 Standby : Linux OEL9, PostgreSQL 17.4, repmgr 5.5.0, HAProxy version 2.4.22
Node bkp : Linux OEL9, PostgreSQL 17.4, repmgr 5.5.0, barman 3.13.1 (Barman by EnterpriseDB)
The environment (a test environment) has been working fine. The backup configuration is:
[barman@pgsqlbkp ~]$ cat /etc/barman.d/moodle01.conf
[moodle01]
active = true
description = "PostgreSQL for Moodle Backup Policy"
conninfo = host=pgsql-vip port=5000 user=barman dbname=postgres
streaming_conninfo = host=pgsql-vip port=5000 user=streaming_barman dbname=replication
backup_directory = /u02/backups/moodle01
backup_method = postgres
streaming_archiver = on
slot_name = barman_pg_moodle01
create_slot = auto
retention_policy = 'RECOVERY WINDOW OF 10 DAYS'
With Barman operational, I simulated a failover of the primary node and then performed a switchover to return to the initial configuration (node 1 primary, node 2 standby). Since then, I can't start the WAL receiver.
[barman@pgsqlbkp ~]$ barman check moodle01
Server moodle01:
PostgreSQL: OK
superuser or standard user with backup privileges: OK
PostgreSQL streaming: OK
wal_level: OK
replication slot: FAILED (slot 'barman_pg_moodle01' not initialised: is 'receive-wal' running?)
directories: OK
retention policy settings: OK
backup maximum age: OK (no last_backup_maximum_age provided)
backup minimum size: OK (56.0 MiB)
wal maximum age: OK (no last_wal_maximum_age provided)
wal size: OK (16.0 MiB)
compression settings: OK
failed backups: OK (there are 0 failed backups)
minimum redundancy requirements: OK (have 3 backups, expected at least 0)
pg_basebackup: OK
pg_basebackup compatible: OK
pg_basebackup supports tablespaces mapping: OK
systemid coherence: OK
pg_receivexlog: OK
pg_receivexlog compatible: OK
receive-wal running: FAILED (See the Barman log file for more details)
archiver errors: OK
[barman@pgsqlbkp ~]$ barman cron
Starting WAL archiving for server moodle01
Starting streaming archiver for server moodle01
[barman@pgsqlbkp ~]$
The output from the Barman log is:
2025-04-14 14:57:46,494 [12193] barman.server INFO: Starting receive-wal for server moodle01
2025-04-14 14:57:46,496 [12193] barman.wal_archiver INFO: Activating WAL archiving through streaming protocol
2025-04-14 14:57:46,521 [12193] barman.command_wrappers INFO: moodle01: pg_receivewal: starting log streaming at 0/27000000 (timeline 7)
2025-04-14 14:57:46,529 [12193] barman.command_wrappers INFO: moodle01: pg_receivewal: error: unexpected termination of replication stream: ERROR: requested WAL segment 000000080000000000000027 has already been removed
2025-04-14 14:57:46,530 [12193] barman.command_wrappers INFO: moodle01: pg_receivewal: error: disconnected
2025-04-14 14:57:46,531 [12193] barman.server ERROR: ArchiverFailure:pg_receivewal terminated with error code: 1
2025-04-14 14:58:02,023 [12217] barman.utils INFO: Cleaning up lockfiles directory.
2025-04-14 14:58:02,217 [12218] barman.wal_archiver INFO: Found 1 xlog segments from streaming for moodle01. Archive all segments in one run.
2025-04-14 14:58:02,217 [12218] barman.wal_archiver INFO: Archiving segment 1 of 1 from streaming: moodle01/00000007.history
2025-04-14 14:58:02,313 [12219] barman.server INFO: Starting receive-wal for server moodle01
2025-04-14 14:58:02,315 [12219] barman.server ERROR: ArchiverFailure:pg_receivewal not present in $PATH
[barman@pgsqlbkp ~]$ which pg_receivewal
/usr/pgsql-17/bin/pg_receivewal
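For reference, the segment name in the pg_receivewal error can be decoded to compare it with the position where streaming starts. This is only a minimal Python sketch: decode_wal_name is a hypothetical helper (not part of Barman or PostgreSQL), and it assumes the default wal_segment_size of 16 MB.

```python
# Decode a PostgreSQL WAL segment file name into its timeline and start LSN.
# Assumption: default wal_segment_size of 16 MB.
# decode_wal_name is a hypothetical helper, not a Barman or PostgreSQL API.

WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # assumed default segment size in bytes


def decode_wal_name(name, segment_size=WAL_SEGMENT_SIZE):
    """Return (timeline, start_lsn) for a 24-character WAL segment name."""
    if len(name) != 24:
        raise ValueError("expected a 24-character WAL segment name")
    timeline = int(name[0:8], 16)   # first 8 hex digits: timeline ID
    log_id = int(name[8:16], 16)    # next 8 hex digits: high part of the LSN
    seg_id = int(name[16:24], 16)   # last 8 hex digits: segment within log_id
    segments_per_log = 0x100000000 // segment_size
    start = (log_id * segments_per_log + seg_id) * segment_size
    lsn = "%X/%X" % (start >> 32, start & 0xFFFFFFFF)
    return timeline, lsn


timeline, lsn = decode_wal_name("000000080000000000000027")
print("timeline=%d start_lsn=%s" % (timeline, lsn))
# prints: timeline=8 start_lsn=0/27000000
```

Under that assumption, the requested segment belongs to timeline 8 and starts at 0/27000000, while pg_receivewal reports that it starts streaming at 0/27000000 on timeline 7.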
I can't force a backup either:
[barman@pgsqlbkp ~]$ barman backup --immediate-checkpoint moodle01
ERROR: Impossible to start the backup. Check the log for more details, or run 'barman check moodle01'
2025-04-14 15:04:02,000 [12363] barman.server ERROR: Impossible to start the backup. Check the log for more details, or run 'barman check moodle01'
2025-04-14 15:04:02,191 [12387] barman.utils INFO: Cleaning up lockfiles directory.
2025-04-14 15:04:02,372 [12388] barman.wal_archiver INFO: No xlog segments found from streaming for moodle01.
2025-04-14 15:04:02,459 [12389] barman.server INFO: Starting receive-wal for server moodle01
2025-04-14 15:04:02,461 [12389] barman.server ERROR: ArchiverFailure:pg_receivewal not present in $PATH
During the failover/switchover test, the PostgreSQL wal_keep_size setting was 0.
I don't know how to proceed and would appreciate any tips.
Thank you in advance.
Kind regards,
Mauricio Fernández