Hello all,
I'm not sure whether this is a Barman or a repmgr problem, but since I'm seeing it on the Barman side, I guess it belongs here.
I'm having trouble configuring Barman correctly for a cluster managed by repmgr.
but with this (from the PostgreSQL documentation):
Setting archive_command to a command that does nothing but return true,
e.g., /bin/true (REM on Windows), effectively disables archiving, but also
breaks the chain of WAL files needed for archive recovery, so it should
only be used in unusual circumstances.
In fact, in Barman I have a backup whose output says:
Backup completed (start time: 2020-11-20 08:43:15.984830, elapsed time: 57 seconds)
WARNING: IMPORTANT: this backup is classified as WAITING_FOR_WALS, meaning that Barman has not received yet all the required WAL files for the backup consistency.
This is a common behaviour in concurrent backup scenarios, and Barman automatically set the backup as DONE once all the required WAL files have been archived.
Hint: execute the backup command with '--wait'
If I run the command with "--wait", I get:
Waiting for the WAL file 000000170000000200000065 from server 'postgres2.test.dc'
Another archive-wal process is already running on server postgres2.test.dc. Skipping to the next server
and it hangs there.
Should I enable the archive command, or set it to "" so that the server keeps the WAL files?
Or when will Barman receive the WAL and mark the backup as done?
Another problem I ran into is an error in the WAL streaming.
When the Postgres node goes down, I rejoin it to the repmgr cluster using
repmgr -h {master_node} -U repmgr -d repmgr -f /etc/repmgr.conf standby clone -F
Is it possible that I then have to reset the WAL streaming status on the Barman server with
barman receive-wal --reset postgres2.test.dc
Probably the history or something breaks because I do a rejoin. What is the correct approach here?
I ask because I only discovered days later that Barman had stopped working because of this.
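
To make the second question concrete, the manual recovery on the Barman server could be sketched as the shell function below. Only the "receive-wal --reset" call is something I actually ran. The function name resync_barman_streaming, and the extra "receive-wal --stop", "barman cron" and "barman check" steps, are my assumptions about what a complete procedure might look like, not a confirmed fix.

```shell
#!/bin/sh
# Hypothetical helper (the name is mine): re-sync Barman's WAL streaming
# after the PostgreSQL node has been re-cloned with repmgr.
# Usage: resync_barman_streaming postgres2.test.dc
resync_barman_streaming() {
    server="$1"

    # Assumption: stop any running receive-wal process first.
    # Its exit status is ignored on purpose, since there may be none running.
    barman receive-wal --stop "$server"

    # The step I ran by hand: reset the receive-wal status for this server.
    barman receive-wal --reset "$server" || return 1

    # Assumption: let Barman's maintenance run start receive-wal again.
    barman cron || return 1

    # Assumption: verify that the server checks pass afterwards.
    barman check "$server"
}
```

If something like this is the right approach, I could call it from the same script that runs the repmgr standby clone, so that I don't find out days later that streaming has stopped.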