3-node repmgr cluster & barman on one slave


Bernhard Hortens

Nov 8, 2024, 4:12:47 AM
to Barman, Backup and Recovery Manager for PostgreSQL
Hello all, 

[OK, after typing all this I remembered that barman has a debug log setting, so I turned it on, and then it was obvious what was happening. See below.]

We are running a 3-node repmgr cluster (PG16), and I recently set up barman (3.11.1) on one of the designated slaves.

My plan was to set up barman to connect to localhost on the slave, to put less stress on the master (important data is committed synchronously anyway). Backups reside on a strongly redundant network filer and are synced offsite by a different job.

After some confusion about the WAL archive (I have that already set up for repmgr), barman check dbserver flashes all green.

I can start a compressed backup, but it remains (for hours) in the "WAITING_FOR_WALS" state. OK, the db server is not seeing much traffic yet, but I would still like it to finish within a reasonable amount of time.
[Funnily enough, the warning about the WAITING_FOR_WALS status still appears after the backup finishes, but when I checked a minute later with show-backup, it finally showed "DONE". I also had to use the barman user for the primary connection, not streaming_barman, because the replication role seemed OK for getting the system ID but not for switching WALs.]

I thought that setting up a primary_conninfo would help, based on the docs, but I hit a snag there: barman tells me that the system ID is not the same on the master and the slave. (It is the same if I request the info via psql as the barman user from both servers, and if I switch the streaming_conninfo to the master, barman diagnose shows the same ID too.)

Primary and standby have same system ID: FAILED (primary_conninfo and conninfo should point to primary and standby servers which share the same system identifier)

-> this was in the debug log:
barman.postgres DEBUG: Error calling pg_is_in_recovery() function: connection to server at "dbmaster" (10.0.0.1), port 5432 failed: FATAL:  no pg_hba.conf entry for host "10.0.0.2", user "streaming_barman", database "postgres", SSL encryption
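
Read literally, the FATAL line above names everything a matching pg_hba.conf rule on dbmaster would need: connection type, database, user, and client address. The following is only a rough Python sketch of that mapping, not something barman or PostgreSQL provides. The helper name `suggest_hba_line`, the `/32` mask (which assumes an IPv4 client address), and the `scram-sha-256` auth method are assumptions for illustration; the method in particular should match whatever the cluster already uses.

```python
import re

# The FATAL portion of the debug log entry above, on one line.
LOG_LINE = (
    'FATAL:  no pg_hba.conf entry for host "10.0.0.2", '
    'user "streaming_barman", database "postgres", SSL encryption'
)

PATTERN = re.compile(
    r'no pg_hba\.conf entry for host "(?P<host>[^"]+)", '
    r'user "(?P<user>[^"]+)", database "(?P<database>[^"]+)", '
    r'(?P<encryption>\w+ encryption)'
)


def suggest_hba_line(message: str, auth_method: str = "scram-sha-256") -> str:
    """Build a candidate pg_hba.conf line from a rejected-connection message.

    Hypothetical helper; auth_method is an assumption, not taken from the log.
    """
    match = PATTERN.search(message)
    if match is None:
        raise ValueError("not a 'no pg_hba.conf entry' message")
    # "hostssl" matches only SSL connections; plain "host" matches any TCP/IP one.
    conn_type = "hostssl" if match["encryption"] == "SSL encryption" else "host"
    # /32 limits the rule to the single IPv4 address named in the log (assumption).
    address = f'{match["host"]}/32'
    # pg_hba.conf field order: TYPE DATABASE USER ADDRESS METHOD
    return "  ".join(
        [conn_type, match["database"], match["user"], address, auth_method]
    )


print(suggest_hba_line(LOG_LINE))
```

A rule like the one printed would go into pg_hba.conf on the master, followed by a configuration reload. Note that the rejected user in the log is streaming_barman, so the rule has to cover whichever user primary_conninfo actually connects as.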


How can I further diagnose what's not working with the primary_conninfo? Is that even going to solve my "WAITING_FOR_WALS" problem, and is my setup halfway OK at all? I really want to avoid using a separate server for barman and pulling the backup from the master db, if possible.

[cluster]
description = "PostgreSQL Cluster"
conninfo = host=localhost user=barman dbname=postgres
streaming_conninfo = host=localhost user=streaming_barman dbname=postgres
primary_conninfo = host=dbmaster user=barman dbname=postgres
slot_name = barman
backup_method = postgres
streaming_archiver = on
wal_retention_policy = 'recovery window of 7 days'
backup_directory = /filer/barman/cluster
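
To make the topology in this config explicit, here is a small Python sketch that lists which host and user each *conninfo option targets. It is only an illustration: `conninfo_targets` is a made-up helper, it is not how barman itself parses its configuration, and the naive keyword=value splitting assumes conninfo strings without quoted values or URIs.

```python
import configparser

# The connection-related options from the server section above, copied as posted.
BARMAN_SECTION = """
[cluster]
description = "PostgreSQL Cluster"
conninfo = host=localhost user=barman dbname=postgres
streaming_conninfo = host=localhost user=streaming_barman dbname=postgres
primary_conninfo = host=dbmaster user=barman dbname=postgres
"""


def conninfo_targets(text: str, server: str) -> dict:
    """Map each *conninfo option of a server section to its (host, user) pair."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    targets = {}
    for option, value in parser[server].items():
        if not option.endswith("conninfo"):
            continue
        # Naive keyword=value parsing; quoted values and URIs are not handled.
        fields = dict(pair.split("=", 1) for pair in value.split())
        targets[option] = (fields.get("host"), fields.get("user"))
    return targets


for option, (host, user) in conninfo_targets(BARMAN_SECTION, "cluster").items():
    print(f"{option}: host={host} user={user}")
```

With the section as posted, conninfo and streaming_conninfo both point at localhost (the standby), and only primary_conninfo reaches dbmaster, so that is the one connection that depends on the master's pg_hba.conf accepting the barman host.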


Martin Marques

Mar 17, 2025, 5:31:36 AM
to pgba...@googlegroups.com
Hi Bernhard,

Sorry for not getting back earlier.

Could you try a more recent version of Barman? There were some fixes related to how primary_conninfo works. I'm not sure that what you are experiencing is related to those changes, but it would be great to rule that out.
