Backup patroni cluster with Barman (WAITING_FOR_WALS on some nodes)


Stefano T

Apr 17, 2024, 4:50:22 AM
to Barman, Backup and Recovery Manager for PostgreSQL

Hello,
I have set up a 3-node cluster with Patroni.
Barman backs up all three nodes.
If we run `barman check all`, every check passes.
However, we run `barman backup DATABASE` daily and, on SOME nodes, the backup ends up in the WAITING_FOR_WALS state.

In the output of `barman backup`, for SOME nodes there is this:
WARNING: IMPORTANT: this backup is classified as WAITING_FOR_WALS, meaning that Barman has not received yet all the required WAL files for the backup consistency.
This is a common behaviour in concurrent backup scenarios, and Barman automatically set the backup as DONE once all the required WAL files have been archived.
Hint: execute the backup command with '--wait'


(I tried with --wait and the backup never finished. Related to this, I guess: Barman wrote a log entry roughly every 0.1 seconds saying that it was waiting for the WAL. I had to restart the server.
2024-04-17 08:29:40,499 [28098] barman.wal_archiver INFO: No xlog segments found from streaming for postgres1.dev.dc.
2024-04-17 08:29:40,601 [28098] barman.wal_archiver INFO: No xlog segments found from streaming for postgres1.dev.dc.
2024-04-17 08:29:40,703 [28098] barman.wal_archiver INFO: No xlog segments found from streaming for postgres1.dev.dc..
)
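
When `--wait` is used, bounding the waiting time avoids a backup command that blocks indefinitely. A minimal sketch, assuming a Barman version whose `barman backup` accepts `--wait-timeout` (in seconds); the wrapper name `backup_with_wait_timeout`, the `BARMAN` override variable, and the 600-second default are illustrative assumptions, not part of Barman:

```shell
#!/bin/bash
# Hypothetical wrapper: run a backup that waits for the required WALs,
# but stops waiting after a timeout instead of blocking forever.
# BARMAN can be overridden (e.g. BARMAN=echo) to preview the command
# without actually running a backup.
backup_with_wait_timeout() {
    local server=$1
    local timeout=${2:-600}   # seconds; 600 is an arbitrary example value
    "${BARMAN:-barman}" backup --wait --wait-timeout "$timeout" "$server"
}
```

For example, `BARMAN=echo backup_with_wait_timeout postgres1.dev.dc 30` only prints the command that would be run.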

The barman cron is working:
2024-04-17 08:44:01,458 [2143] barman.wal_archiver INFO: No xlog segments found from streaming for postgres1.dc.
2024-04-17 08:44:01,519 [2145] barman.wal_archiver INFO: No xlog segments found from streaming for postgres1.dev.dc.
2024-04-17 08:44:01,565 [2146] barman.wal_archiver INFO: No xlog segments found from streaming for postgres1.test.dc.
2024-04-17 08:44:01,641 [2147] barman.wal_archiver INFO: No xlog segments found from streaming for postgres2.dc.
2024-04-17 08:44:01,700 [2148] barman.wal_archiver INFO: No xlog segments found from streaming for postgres2.test.dc.
2024-04-17 08:44:01,757 [2150] barman.wal_archiver INFO: No xlog segments found from streaming for postgres3.dc.
2024-04-17 08:44:01,981 [2152] barman.wal_archiver INFO: No xlog segments found from streaming for postgres3.test.dc.
2024-04-17 08:44:06,449 [823] barman.command_wrappers INFO: postgres2.test.dc: pg_receivewal: finished segment at 1A0/F000000 (timeline 39)
2024-04-17 08:44:06,449 [859] barman.command_wrappers INFO: postgres3.test.dc: pg_receivewal: finished segment at 1A0/F000000 (timeline 39)
2024-04-17 08:44:06,449 [761] barman.command_wrappers INFO: postgres1.test.dc: pg_receivewal: finished segment at 1A0/F000000 (timeline 39)
2024-04-17 08:45:02,232 [2266] barman.wal_archiver INFO: No xlog segments found from streaming for postgres1.dc.
2024-04-17 08:45:02,299 [2268] barman.wal_archiver INFO: No xlog segments found from streaming for postgres1.dev.dc.
2024-04-17 08:45:02,357 [2269] barman.wal_archiver INFO: Found 1 xlog segments from streaming for postgres1.test.dc. Archive all segments in one run.
2024-04-17 08:45:02,357 [2269] barman.wal_archiver INFO: Archiving segment 1 of 1 from streaming: postgres1.test.dc/00000027000001A00000000E
2024-04-17 08:45:02,396 [2270] barman.wal_archiver INFO: No xlog segments found from streaming for postgres2.dc.
2024-04-17 08:45:02,490 [2271] barman.wal_archiver INFO: Found 1 xlog segments from streaming for postgres2.test.dc. Archive all segments in one run.
2024-04-17 08:45:02,490 [2271] barman.wal_archiver INFO: Archiving segment 1 of 1 from streaming: postgres2.test.dc/00000027000001A00000000E
2024-04-17 08:45:02,577 [2275] barman.wal_archiver INFO: No xlog segments found from streaming for postgres3.dc.
2024-04-17 08:45:03,040 [2305] barman.wal_archiver INFO: Found 1 xlog segments from streaming for postgres3.test.dc. Archive all segments in one run.
2024-04-17 08:45:03,040 [2305] barman.wal_archiver INFO: Archiving segment 1 of 1 from streaming: postgres3.test.dc/00000027000001A00000000E


What can it be? Is there maybe some setting in Patroni that we have to change?



Note: we don't use Barman from Patroni for restores or anything else (if needed, we restore manually); we use it only to back up the cluster.
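
To see at a glance which servers actually receive streamed WAL segments, the `barman.wal_archiver` lines from the cron log can be summarized per server. A minimal sketch, assuming the log format shown in this post; the function name `summarize_wal_log` and the log path in the usage note are hypothetical:

```shell
#!/bin/bash
# Read Barman log lines on stdin and print, per server:
#   <server> <cron runs with no segments> <cron runs with segments>
summarize_wal_log() {
    awk '
        /barman\.wal_archiver INFO: (No|Found [0-9]+) xlog segments/ {
            # The server name is the word after "for"; drop trailing dots.
            srv = ""
            for (i = 1; i < NF; i++) if ($i == "for") { srv = $(i + 1); break }
            if (srv == "") next
            sub(/\.+$/, "", srv)
            if ($0 ~ /INFO: No xlog/) none[srv]++; else found[srv]++
            seen[srv] = 1
        }
        END { for (s in seen) printf "%s %d %d\n", s, none[s], found[s] }
    ' | LC_ALL=C sort
}
```

Usage would be something like `summarize_wal_log < /var/log/barman/barman.log` (the path depends on the installation). A server whose last column stays at 0 never had a streamed segment archived during the logged period.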

Stefano T

May 3, 2024, 3:31:26 AM
to Barman, Backup and Recovery Manager for PostgreSQL
I ended up deleting all the backups that were WAITING_FOR_WALS.
Today all the new backups completed correctly. There may have been a problem with the old backups and/or some stored WAL: we had restored the database and changed repmgr to patroni on some of the backups.

PS: if someone wants the script to clean up all WAITING_FOR_WALS backups, here it is. Just pass the machine name or "all" as the first parameter.

#!/bin/bash
# Delete every backup that is in the WAITING_FOR_WALS state.
# Usage: pass a server name, or "all", as the first parameter.

# Run the command and store the output in a variable
output=$(barman list-backup "$1")

# Loop through each line of the output
while IFS= read -r line; do
    # Check if the line ends with "WAITING_FOR_WALS"
    if [[ $line == *WAITING_FOR_WALS ]]; then
        # Extract the first two fields: server name and backup ID
        read -r server backup_id _ <<< "$line"

        # Show what is being deleted, then call barman delete
        echo "$server" "$backup_id"
        barman delete "$server" "$backup_id"
    fi
done <<< "$output"
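
Since `barman delete` cannot be undone, it may be worth previewing the selection first. A minimal dry-run sketch, assuming (as the script above does) that each `list-backup` line starts with the server name and the backup ID and ends with the status; the function name `waiting_backups` is hypothetical:

```shell
#!/bin/bash
# Read `barman list-backup` output on stdin and print "<server> <backup_id>"
# for every line that ends with WAITING_FOR_WALS. Nothing is deleted.
waiting_backups() {
    local server backup_id rest
    while read -r server backup_id rest; do
        if [[ $rest == *WAITING_FOR_WALS ]]; then
            printf '%s %s\n' "$server" "$backup_id"
        fi
    done
    return 0
}
```

Running `barman list-backup all | waiting_backups` then lists the candidates without touching them.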