Barman & repmgr: archive command and streaming WAL


Stefano Tranquillini

Nov 20, 2020, 3:55:45 AM
to Barman, Backup and Recovery Manager for PostgreSQL

Hello all,
Not sure if this is a Barman or a repmgr problem, but since I'm seeing it on the Barman side, I guess it goes here.

I'm having some trouble configuring Barman correctly for a cluster managed by repmgr.

The repmgr docs (https://repmgr.org/docs/repmgr.html) say to use /bin/true as the archive command:
# Set archive command to a dummy command; this can later be changed without
# needing to restart the PostgreSQL instance.
#
# See: https://www.postgresql.org/docs/current/runtime-config-wal.html#GUC-ARCHIVE-COMMAND
archive_command = '/bin/true'
but the PostgreSQL docs say this about it:
Setting archive_command to a command that does nothing but return true, e.g., /bin/true (REM on Windows), effectively disables archiving, but also breaks the chain of WAL files needed for archive recovery, so it should only be used in unusual circumstances.

In fact, in Barman I have a backup whose output says:
Backup completed (start time: 2020-11-20 08:43:15.984830, elapsed time: 57 seconds)
WARNING: IMPORTANT: this backup is classified as WAITING_FOR_WALS, meaning that Barman has not received yet all the required WAL files for the backup consistency.
This is a common behaviour in concurrent backup scenarios, and Barman automatically set the backup as DONE once all the required WAL files have been archived.
Hint: execute the backup command with '--wait'

If I run the command with "--wait", I get:
Waiting for the WAL file 000000170000000200000065 from server 'postgres2.test.dc'
Another archive-wal process is already running on server postgres2.test.dc. Skipping to the next server

and it hangs there.

Should I enable a real archive command, or set it to "" so the server keeps the WAL files?
Or, otherwise, when will Barman receive the WAL and mark the backup as done?

Another problem I encountered is an error in the WAL streaming.
It happens when a Postgres node goes down and I rejoin it to the repmgr cluster using:
repmgr -h {master_node} -U repmgr -d repmgr -f /etc/repmgr.conf standby clone -F

Is it possible that I then have to do a reset of the log on the Barman server with:
barman receive-wal --reset postgres2.test.dc

Probably the history or something breaks since I do a rejoin. What's the correct approach here?
I ask because I only discovered after days that Barman was not working due to that.

Jason Boyer

Nov 20, 2020, 9:06:49 AM
to pgba...@googlegroups.com
Hi Stefano, that recommendation in the repmgr docs is there because turning archiving on requires restarting Postgres, and you may want to enable archiving later (such as when using barman). It’s a preventative measure to avoid pain in the future: https://repmgr.org/docs/repmgr.html#CONFIGURATION : “We suggest setting archive_mode to on (and archive_command to /bin/true; see below) even if you are **currently not planning** to use WAL file archiving.” - emphasis mine.

That suggestion is so that if you do enable a “real” archive_command you don’t have to restart Postgres, just a reload. Barman does require a real archive_command to send the WAL to the barman server so the backups it creates can be correctly restored.

If barman is installed locally on the Postgres machine (this is not necessary) it offers a sample script you can use: http://docs.pgbarman.org/release/2.12/barman-wal-archive.1.html. But you can also use something as simple as 'rsync -a %p barman@YOUR_BARMAN_SERVER:/var/lib/barman/YOUR_PG_SERVER/incoming/%f' as an archive_command, and barman and repmgr will get along just fine. You will need to make sure that the postgres user can ssh to the barman server as the barman user. (I believe this is part of the standard barman setup already.)
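
For illustration, here is a minimal sketch of how the two archive_command variants mentioned above might look in postgresql.conf. YOUR_BARMAN_SERVER and YOUR_PG_SERVER are the same placeholders as in the rsync example. The barman-wal-archive line assumes that utility is installed on the Postgres host and uses its documented "BARMAN_HOST SERVER_NAME WAL_PATH" argument order. The /var/lib/barman path assumes Barman's default home directory. Treat this as a starting point, not a tested configuration.

```ini
# postgresql.conf on the PostgreSQL node (sketch; host and server names are placeholders)

# Changing archive_mode needs a restart, which is why repmgr suggests enabling it early.
archive_mode = on

# Variant 1: plain rsync into Barman's incoming directory
# (assumes the default Barman home, /var/lib/barman).
archive_command = 'rsync -a %p barman@YOUR_BARMAN_SERVER:/var/lib/barman/YOUR_PG_SERVER/incoming/%f'

# Variant 2: the barman-wal-archive utility (commented out; only one archive_command can be active).
#archive_command = 'barman-wal-archive YOUR_BARMAN_SERVER YOUR_PG_SERVER %p'
```

Switching archive_command from /bin/true to either variant only needs a configuration reload, not a restart.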

You don’t want to set archive_command to “” because the WAL still won’t be sent to the barman server.

Jason

-- 
Jason Boyer
Senior System Administrator
Equinox Open Library Initiative
email:  JBo...@EquinoxInitiative.org
web:  https://EquinoxInitiative.org/


Stefano Tranquillini

Nov 23, 2020, 4:19:41 AM
to Barman, Backup and Recovery Manager for PostgreSQL
But is a real archive command needed for Barman? I'm shipping the WAL using streaming (as in the conf example); should I also have an archive command?
In that case, can pgpool stay at /bin/true?
Will it work?

Jason Boyer

Nov 23, 2020, 7:35:18 AM
to pgba...@googlegroups.com
You’re not specifically required to use an archive command, but the WAL does have to get to the barman server somehow or the backups get stuck in that WAITING_FOR_WALS state. This is done by using an archive_command or by setting up barman to use streaming replication (there is a third option but I would recommend using one of those two).

Did you set up barman to use streaming replication as well, or is that only between your repmgr secondaries? If you have not set up barman to use replication, there’s a sample config here: http://docs.pgbarman.org/release/2.12/#examples-of-configuration and there’s a little more detail here: http://docs.pgbarman.org/release/2.12/#archiving-features. If you have set that up, you don’t need to also set up an archive_command, but you may need to try some sample connections following the examples in the barman documentation to see if there’s an issue connecting.
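
As a rough sketch of what a streaming-only server definition could look like on the barman side, modelled on the streaming example in the barman documentation linked above: the file path, the user names barman and streaming_barman, and the slot name barman are assumptions for illustration, and postgres2.test.dc is simply the server name that appears earlier in this thread.

```ini
; /etc/barman.d/postgres2.test.dc.conf (sketch; path, users and slot name are assumptions)
[postgres2.test.dc]
description = "Streaming-only backup of postgres2.test.dc"

; Regular connection, used for checks and for coordinating backups.
conninfo = host=postgres2.test.dc user=barman dbname=postgres

; Replication connection, used to stream WAL and take base backups.
streaming_conninfo = host=postgres2.test.dc user=streaming_barman

; Take base backups over the streaming connection instead of rsync/ssh.
backup_method = postgres

; Receive WAL through streaming replication, tied to a replication slot.
streaming_archiver = on
slot_name = barman
```

With something like this in place, "barman receive-wal --create-slot postgres2.test.dc" creates the replication slot and "barman check postgres2.test.dc" reports whether the connections and WAL streaming are working.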

You may need to increase the max_wal_senders value in postgresql.conf if it’s too low.
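
For reference, a small sketch of the PostgreSQL-side settings this refers to. The numbers are illustrative assumptions, not values from this thread. The idea is to leave enough WAL senders for the repmgr standbys plus barman's WAL streaming and its base backup connections.

```ini
# postgresql.conf on each PostgreSQL node (sketch; the numbers are illustrative only)

# Streaming replication needs at least the "replica" WAL level.
wal_level = replica

# One sender per standby, plus barman's receive-wal and base backup connections.
max_wal_senders = 10

# Needed if barman (or the standbys) use replication slots.
max_replication_slots = 10
```

All three of these settings require a PostgreSQL restart to take effect.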

Jason


Stefano Tranquillini

Dec 3, 2020, 3:58:57 AM
to Barman, Backup and Recovery Manager for PostgreSQL
Barman is in streaming mode. Apparently the WAITING_FOR_WALS state disappears after a while; I'm not sure when, and not always.
But at some point the backup is marked as done.
Probably the wait for WAL is longer for the "slaves".
Since I have a master and two slaves, and I back up all of them, the slaves seem to suffer from this problem, although it's not always the case.

No idea why, but barman check says that everything is fine.