pg_receivewal: requested WAL segment has already been removed after repmgr failover / switchover

Mauricio Fernandez

Apr 14, 2025, 3:05:09 PM
to Barman, Backup and Recovery Manager for PostgreSQL
Hi community, I've been testing a PostgreSQL HA environment which includes:

Node 1 Primary : Linux OEL9, PostgreSQL 17.4, repmgr 5.5.0, HAProxy version 2.4.22
Node 2 Standby : Linux OEL9, PostgreSQL 17.4, repmgr 5.5.0, HAProxy version 2.4.22
Node bkp       : Linux OEL9, PostgreSQL 17.4, repmgr 5.5.0, barman 3.13.1 Barman by EnterpriseDB


The environment has been working fine (it is a test environment). The backup configuration is:

[barman@pgsqlbkp ~]$ cat /etc/barman.d/moodle01.conf
[moodle01]
active = true
description = "PostgreSQL for Moodle Backup Policy"
conninfo = host=pgsql-vip port=5000 user=barman dbname=postgres
streaming_conninfo = host=pgsql-vip port=5000 user=streaming_barman dbname=replication
backup_directory = /u02/backups/moodle01
backup_method = postgres
streaming_archiver = on
slot_name = barman_pg_moodle01
create_slot = auto
retention_policy = 'RECOVERY WINDOW OF 10 DAYS'

With Barman operational, I simulated a failover of the primary node and then a switchover to get back to the initial configuration (node 1 primary, node 2 standby). After this I can't start the WAL receiver.

[barman@pgsqlbkp ~]$ barman check moodle01
Server moodle01:
        PostgreSQL: OK
        superuser or standard user with backup privileges: OK
        PostgreSQL streaming: OK
        wal_level: OK
        replication slot: FAILED (slot 'barman_pg_moodle01' not initialised: is 'receive-wal' running?)
        directories: OK
        retention policy settings: OK
        backup maximum age: OK (no last_backup_maximum_age provided)
        backup minimum size: OK (56.0 MiB)
        wal maximum age: OK (no last_wal_maximum_age provided)
        wal size: OK (16.0 MiB)
        compression settings: OK
        failed backups: OK (there are 0 failed backups)
        minimum redundancy requirements: OK (have 3 backups, expected at least 0)
        pg_basebackup: OK
        pg_basebackup compatible: OK
        pg_basebackup supports tablespaces mapping: OK
        systemid coherence: OK
        pg_receivexlog: OK
        pg_receivexlog compatible: OK
        receive-wal running: FAILED (See the Barman log file for more details)
        archiver errors: OK

[barman@pgsqlbkp ~]$ barman cron
Starting WAL archiving for server moodle01
Starting streaming archiver for server moodle01
[barman@pgsqlbkp ~]$ 

The output from the log is

2025-04-14 14:57:46,494 [12193] barman.server INFO: Starting receive-wal for server moodle01
2025-04-14 14:57:46,496 [12193] barman.wal_archiver INFO: Activating WAL archiving through streaming protocol
2025-04-14 14:57:46,521 [12193] barman.command_wrappers INFO: moodle01: pg_receivewal: starting log streaming at 0/27000000 (timeline 7)
2025-04-14 14:57:46,529 [12193] barman.command_wrappers INFO: moodle01: pg_receivewal: error: unexpected termination of replication stream: ERROR:  requested WAL segment 000000080000000000000027 has already been removed
2025-04-14 14:57:46,530 [12193] barman.command_wrappers INFO: moodle01: pg_receivewal: error: disconnected
2025-04-14 14:57:46,531 [12193] barman.server ERROR: ArchiverFailure:pg_receivewal terminated with error code: 1
2025-04-14 14:58:02,023 [12217] barman.utils INFO: Cleaning up lockfiles directory.
2025-04-14 14:58:02,217 [12218] barman.wal_archiver INFO: Found 1 xlog segments from streaming for moodle01. Archive all segments in one run.
2025-04-14 14:58:02,217 [12218] barman.wal_archiver INFO: Archiving segment 1 of 1 from streaming: moodle01/00000007.history
2025-04-14 14:58:02,313 [12219] barman.server INFO: Starting receive-wal for server moodle01
2025-04-14 14:58:02,315 [12219] barman.server ERROR: ArchiverFailure:pg_receivewal not present in $PATH

[barman@pgsqlbkp ~]$ which pg_receivewal
/usr/pgsql-17/bin/pg_receivewal

I can't force a backup either:

[barman@pgsqlbkp ~]$ barman backup --immediate-checkpoint moodle01
ERROR: Impossible to start the backup. Check the log for more details, or run 'barman check moodle01'

2025-04-14 15:04:02,000 [12363] barman.server ERROR: Impossible to start the backup. Check the log for more details, or run 'barman check moodle01'
2025-04-14 15:04:02,191 [12387] barman.utils INFO: Cleaning up lockfiles directory.
2025-04-14 15:04:02,372 [12388] barman.wal_archiver INFO: No xlog segments found from streaming for moodle01.
2025-04-14 15:04:02,459 [12389] barman.server INFO: Starting receive-wal for server moodle01
2025-04-14 15:04:02,461 [12389] barman.server ERROR: ArchiverFailure:pg_receivewal not present in $PATH

While I conducted the failover/switchover test, the PostgreSQL wal_keep_size setting was 0.

I don't know how to proceed; I would appreciate any tips.

thank you in advance

kind regards

Mauricio Fernández 


Luca Ferrari

Apr 15, 2025, 1:18:25 AM
to pgba...@googlegroups.com
On Mon, Apr 14, 2025 at 9:05 PM Mauricio Fernandez
<mmauricio...@gmail.com> wrote:
>
> 2025-04-14 14:58:02,315 [12219] barman.server ERROR: ArchiverFailure:pg_receivewal not present in $PATH
>
> [barman@pgsqlbkp ~]$ which pg_receivewal
> /usr/pgsql-17/bin/pg_receivewal
>

try putting this in your barman.conf:

path_prefix = /usr/pgsql-17/bin/


Luca
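
A possible explanation for `which` finding the binary while Barman's log says it is missing: the first `barman cron` was run from a login shell, whereas the later runs in the log may have been started by the system cron, which uses its own, shorter PATH. The sketch below resolves a command against an explicit PATH to show the difference. The function name `resolve_with_path` and the cron-style PATH value `/usr/bin:/bin` are assumptions for illustration, not taken from this thread.

```bash
#!/usr/bin/env bash
# Resolve a command name against an explicit PATH instead of the
# login shell's PATH (hypothetical helper, for illustration only).
resolve_with_path() {
    local cmd=$1 search_path=$2
    # Subshell so the caller's PATH and command hash stay untouched.
    ( PATH=$search_path; command -v "$cmd" )
}

# Assumed cron-style PATH without the PostgreSQL bin directory.
resolve_with_path pg_receivewal "/usr/bin:/bin" \
    || echo "pg_receivewal not found with cron-style PATH"

# Same lookup with the directory reported by 'which' added in front.
resolve_with_path pg_receivewal "/usr/pgsql-17/bin:/usr/bin:/bin" \
    || echo "pg_receivewal not found"
```

Setting `path_prefix` in the Barman configuration has a similar effect to the second call: Barman looks in that directory before the rest of PATH, regardless of how it was started.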

Mauricio Fernandez

Apr 15, 2025, 8:18:14 AM
to Barman, Backup and Recovery Manager for PostgreSQL
Thank you very much Luca, the new parameter works fine. I no longer get the PATH error, but I'm still getting an error because of a lost WAL file.

2025-04-15 08:12:01,826 [23478] barman.utils INFO: Cleaning up lockfiles directory.
2025-04-15 08:12:02,023 [23479] barman.wal_archiver INFO: Found 1 xlog segments from streaming for moodle01. Archive all segments in one run.
2025-04-15 08:12:02,023 [23479] barman.wal_archiver INFO: Archiving segment 1 of 1 from streaming: moodle01/00000007.history
2025-04-15 08:12:02,133 [23480] barman.server INFO: Starting receive-wal for server moodle01
2025-04-15 08:12:02,135 [23480] barman.wal_archiver INFO: Activating WAL archiving through streaming protocol
2025-04-15 08:12:02,162 [23480] barman.command_wrappers INFO: moodle01: pg_receivewal: starting log streaming at 0/27000000 (timeline 7)
2025-04-15 08:12:02,169 [23480] barman.command_wrappers INFO: moodle01: pg_receivewal: error: unexpected termination of replication stream: ERROR:  requested WAL segment 000000080000000000000027 has already been removed
2025-04-15 08:12:02,170 [23480] barman.command_wrappers INFO: moodle01: pg_receivewal: error: disconnected
2025-04-15 08:12:02,172 [23480] barman.server ERROR: ArchiverFailure:pg_receivewal terminated with error code: 1
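
The segment name in the error encodes a timeline and a WAL position, which helps when comparing it with the "starting log streaming" line. A minimal sketch that decodes it, assuming the default 16 MiB wal_segment_size (the function name `wal_segment_info` is made up for this example):

```bash
#!/usr/bin/env bash
# Decode a 24-hex-digit WAL segment file name into its timeline and
# starting LSN. Assumes the default 16 MiB wal_segment_size.
wal_segment_info() {
    local name=$1
    local tli=$((16#${name:0:8}))    # first 8 hex digits: timeline ID
    local log=$((16#${name:8:8}))    # next 8: high 32 bits of the LSN
    local seg=$((16#${name:16:8}))   # last 8: segment number within "log"
    # Each segment covers 16 MiB (0x1000000 bytes).
    printf 'timeline=%d lsn=%X/%X\n' "$tli" "$log" $((seg * 0x1000000))
}

wal_segment_info 000000080000000000000027
# prints: timeline=8 lsn=0/27000000
```

So pg_receivewal asks to resume at 0/27000000 on timeline 7, while the server looks for that position in a timeline 8 file that no longer exists. This suggests the server moved on to timeline 8 during the failover/switchover test and has already recycled that segment.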

The HA cluster is working fine. The archive command is archive_command = '/bin/true' because Barman is retrieving the WAL files. Should I mix both methods (streaming and rsync) by adding a command like

archive_command = 'test ! -f $PGBKP/archive/%f && cp %p $PGBKP/archive/%f'

where $PGBKP is a remote directory on the Barman server?

I presume the lost WAL file is unrecoverable. Is there any way to reset Barman and start "from zero"?

Thank you very much in advance

kind regards

Mauricio Fernández




Luca Ferrari

Apr 15, 2025, 11:59:45 AM
to pgba...@googlegroups.com
On Tue, Apr 15, 2025 at 2:18 PM Mauricio Fernandez
<mmauricio...@gmail.com> wrote:
>
> The HA cluster is working fine. The archive command is archive_command = '/bin/true' because barman is retrieving the wal files. Should I mix (streaming and rsync) adding a command like

If archive_mode is off, who cares what archive_command is.

> I presume the lost WAL file is unrecoverable. Is there any way to reset Barman and start "from zero"?

You probably have to delete the backup and start over from scratch,
but I'm not sure if there is another way to make Barman "continue"
combining backups. It is clear that, with a WAL file missing, you now
have an "up to the lost WAL" backup and a new one from then on.


Luca
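
One way to "start over" on the streaming side without deleting anything is sketched below. `barman receive-wal --reset` and `barman switch-wal --force --archive` are existing Barman commands, but whether this sequence is sufficient for this particular setup is an assumption and was not confirmed in the thread. The wrapper function `restart_wal_streaming` is hypothetical.

```bash
#!/usr/bin/env bash
# Sketch: restart Barman's streaming archiver after a WAL gap.
# Run as the barman user; "barman" must be on PATH.
restart_wal_streaming() {
    local server=$1
    # Forget the position pg_receivewal would resume from, so streaming
    # restarts at the server's current WAL file.
    barman receive-wal --reset "$server" || return 1
    # Let Barman's maintenance run start pg_receivewal again.
    barman cron || return 1
    # Force a WAL switch and wait for that segment to reach the archive.
    barman switch-wal --force --archive "$server" || return 1
    # Re-run the checks to see whether the slot and receive-wal are OK.
    barman check "$server"
}
```

It would be called as `restart_wal_streaming moodle01`. Existing backups remain restorable only up to the last WAL file Barman received before the gap, so a new `barman backup moodle01` is needed to cover anything after it.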