Can't manage to get Barman working :( need help

Valentin Protiuc

Aug 11, 2023, 2:58:21 PM
to Barman, Backup and Recovery Manager for PostgreSQL

Hi all!

I am trying to build a PoC with Barman. After many hours of struggling with it (perhaps partly because of the setup I need), I got past most of the issues, but now I'm completely stuck.

My setup is:

After installing the charts I have the following pods: 1 pod for barman, 1 pod for postgresql-ha-pgpool, and 3 pods for postgresql-ha-postgresql (1 primary and 2 standbys).

I run the commands barman switch-xlog --force --archive all and barman cron, and sometimes also barman receive-wal --reset pg. After trying several times, checking, and trying again, barman check finally reports that everything is OK (apart from the two checks that fail because no backup exists yet), like this:

barman@barman-795d88d965-7l7nh:/var/lib/barman$ barman check pg
Server pg:
    PostgreSQL: OK
    superuser or standard user with backup privileges: OK
    PostgreSQL streaming: OK
    wal_level: OK
    replication slot: OK
    directories: OK
    retention policy settings: OK
2023-08-11 18:36:09,717 [619] barman.server ERROR: Check 'backup maximum age' failed for server 'pg'
    backup maximum age: FAILED (interval provided: 1 day, latest backup age: No available backups)
    backup minimum size: OK (0 B)
    wal maximum age: OK (no last_wal_maximum_age provided)
    wal size: OK (0 B)
    compression settings: OK
    failed backups: OK (there are 0 failed backups)
2023-08-11 18:36:09,721 [619] barman.server ERROR: Check 'minimum redundancy requirements' failed for server 'pg'
    minimum redundancy requirements: FAILED (have 0 backups, expected at least 1)
    pg_basebackup: OK
    pg_basebackup compatible: OK
    pg_basebackup supports tablespaces mapping: OK
    systemid coherence: OK (no system Id stored on disk)
    pg_receivexlog: OK
    pg_receivexlog compatible: OK
    receive-wal running: OK
    archiver errors: OK

But when I run barman backup pg I get an error:

barman@barman-795d88d965-7l7nh:/var/lib/barman$ barman backup pg
2023-08-11 18:37:15,937 [639] barman.server ERROR: Check 'replication slot' failed for server 'pg'
2023-08-11 18:37:15,940 [639] barman.server INFO: Ignoring failed check 'backup maximum age' for server 'pg'
2023-08-11 18:37:15,941 [639] barman.server INFO: Ignoring failed check 'minimum redundancy requirements' for server 'pg'
ERROR: Impossible to start the backup. Check the log for more details, or run 'barman check pg'
2023-08-11 18:37:16,008 [639] barman.server ERROR: Impossible to start the backup. Check the log for more details, or run 'barman check pg'

This is what I get from running diagnose:

barman@barman-795d88d965-7l7nh:/var/lib/barman$ barman diagnose
2023-08-11 18:41:01,333 [681] Command WARNING: /bin/sh: 1: lsb_release: not found
2023-08-11 18:41:01,457 [681] Command WARNING: /bin/sh: 1: python: not found
2023-08-11 18:41:01,556 [681] Command WARNING: OpenSSH_8.4p1 Debian-5+deb11u1, OpenSSL 1.1.1n  15 Mar 2022
{
    "global": {
        "config": {
            "backup_method": "postgres",
            "backup_options": "concurrent_backup",
            "barman_home": "/var/lib/barman",
            "barman_user": "barman",
            "compression": "gzip",
            "configuration_files_directory": "/etc/barman/barman.d",
            "errors_list": [],
            "last_backup_maximum_age": "1 day",
            "log_file": "",
            "log_level": "INFO",
            "minimum_redundancy": "1",
            "post_backup_retry_script": "",
            "pre_recovery_retry_script": "",
            "retention_policy": "RECOVERY WINDOW of 1 MONTH",
            "streaming_archiver": "on"
        },
        "system_info": {
            "barman_ver": "3.4.1",
            "kernel_ver": "Linux barman-795d88d965-7l7nh 5.15.49-linuxkit-pr #1 SMP PREEMPT Thu May 25 07:27:39 UTC 2023 x86_64 GNU/Linux",
            "python_ver": "",
            "release": "Debian GNU/Linux 11.7",
            "rsync_ver": "rsync  version 3.2.3  protocol version 31",
            "ssh_ver": "",
            "timestamp": "2023-08-11T18:41:01.561429+00:00"
        }
    },
    "servers": {
        "pg": {
            "backups": {},
            "config": {
                "active": true,
                "archiver": false,
                "archiver_batch_size": 0,
                "backup_compression": null,
                "backup_compression_format": null,
                "backup_compression_level": null,
                "backup_compression_location": null,
                "backup_compression_workers": null,
                "backup_directory": "/var/lib/barman/pg",
                "backup_method": "postgres",
                "backup_options": "concurrent_backup",
                "bandwidth_limit": null,
                "barman_home": "/var/lib/barman",
                "barman_lock_directory": "/var/lib/barman",
                "basebackup_retry_sleep": 30,
                "basebackup_retry_times": 0,
                "basebackups_directory": "/var/lib/barman/pg/base",
                "check_timeout": 30,
                "compression": "gzip",
                "conninfo": "host=pg-postgresql-ha-postgresql user=barman dbname=postgres",
                "create_slot": "manual",
                "custom_compression_filter": null,
                "custom_compression_magic": null,
                "custom_decompression_filter": null,
                "description": "PostgreSQL Database (Streaming-Only)",
                "disabled": false,
                "errors_directory": "/var/lib/barman/pg/errors",
                "forward_config_path": false,
                "immediate_checkpoint": false,
                "incoming_wals_directory": "/var/lib/barman/pg/incoming",
                "last_backup_maximum_age": "1 day",
                "last_backup_minimum_size": null,
                "last_wal_maximum_age": null,
                "max_incoming_wals_queue": null,
                "minimum_redundancy": 1,
                "msg_list": [],
                "name": "pg",
                "network_compression": false,
                "parallel_jobs": 1,
                "path_prefix": "/usr/lib/postgresql/15/bin",
                "post_archive_retry_script": null,
                "post_archive_script": null,
                "post_backup_retry_script": null,
                "post_backup_script": null,
                "post_delete_retry_script": null,
                "post_delete_script": null,
                "post_recovery_retry_script": null,
                "post_recovery_script": null,
                "post_wal_delete_retry_script": null,
                "post_wal_delete_script": null,
                "pre_archive_retry_script": null,
                "pre_archive_script": null,
                "pre_backup_retry_script": null,
                "pre_backup_script": null,
                "pre_delete_retry_script": null,
                "pre_delete_script": null,
                "pre_recovery_retry_script": null,
                "pre_recovery_script": null,
                "pre_wal_delete_retry_script": null,
                "pre_wal_delete_script": null,
                "primary_conninfo": null,
                "primary_ssh_command": null,
                "recovery_options": "",
                "recovery_staging_path": null,
                "retention_policy": "window 1 m",
                "retention_policy_mode": "auto",
                "reuse_backup": null,
                "slot_name": "barman",
                "snapshot_disks": null,
                "snapshot_gcp_project": null,
                "snapshot_instance": null,
                "snapshot_provider": null,
                "snapshot_zone": null,
                "ssh_command": null,
                "streaming_archiver": true,
                "streaming_archiver_batch_size": 50,
                "streaming_archiver_name": "barman_receive_wal",
                "streaming_backup_name": "barman_streaming_backup",
                "streaming_conninfo": "host=pg-postgresql-ha-postgresql user=streaming_barman",
                "streaming_wals_directory": "/var/lib/barman/pg/streaming",
                "tablespace_bandwidth_limit": null,
                "wal_retention_policy": "simple-wal 1 m",
                "wals_directory": "/var/lib/barman/pg/wals"
            },
            "status": {
                "archive_timeout": 0,
                "checkpoint_timeout": 300,
                "config_file": "/opt/bitnami/postgresql/conf/postgresql.conf",
                "connection_error": null,
                "current_lsn": "0/30CF9108",
                "current_size": 321173440.0,
                "current_xlog": null,
                "data_checksums": "off",
                "data_directory": "/bitnami/postgresql/data",
                "has_backup_privileges": true,
                "hba_file": "/opt/bitnami/postgresql/conf/pg_hba.conf",
                "hot_standby": "on",
                "ident_file": "/bitnami/postgresql/data/pg_ident.conf",
                "included_files": [
                    "/bitnami/postgresql/data/postgresql.auto.conf",
                    "/opt/bitnami/postgresql/conf/conf.d/override.conf"
                ],
                "is_in_recovery": true,
                "is_superuser": true,
                "max_replication_slots": "10",
                "max_wal_senders": "10",
                "pg_basebackup_bwlimit": true,
                "pg_basebackup_compatible": true,
                "pg_basebackup_installed": true,
                "pg_basebackup_path": "/usr/lib/postgresql/15/bin/pg_basebackup",
                "pg_basebackup_tbls_mapping": true,
                "pg_basebackup_version": "15.3",
                "pg_receivexlog_compatible": true,
                "pg_receivexlog_installed": true,
                "pg_receivexlog_path": "/usr/lib/postgresql/15/bin/pg_receivewal",
                "pg_receivexlog_supports_slots": true,
                "pg_receivexlog_synchronous": false,
                "pg_receivexlog_version": "15.3",
                "postgres_systemid": "7266128669602238590",
                "replication_slot": [
                    "barman",
                    true,
                    "0/30000000"
                ],
                "replication_slot_support": true,
                "server_txt_version": "15.3",
                "streaming": true,
                "streaming_supported": true,
                "streaming_systemid": "7266128669602238590",
                "synchronous_standby_names": [
                    ""
                ],
                "timeline": 1,
                "version_supported": true,
                "wal_compression": "off",
                "wal_keep_size": "64MB",
                "wal_level": "replica",
                "xlog_segment_size": 16777216,
                "xlogpos": "0/30CF9108"
            },
            "wals": {
                "last_archived_wal_per_timeline": {
                    "00000001": {
                        "compression": "gzip",
                        "name": "00000001000000000000002F",
                        "size": 2045143,
                        "time": 1691778204.509324
                    }
                }
            }
        }
    }
}

From what I see, the Barman server (pod) receives WAL files.

If I run this command from the Barman pod:

barman@barman-795d88d965-7l7nh:/var/lib/barman$ psql -U streaming_barman -h pg-postgresql-ha-postgresql -c "IDENTIFY_SYSTEM" replication=1
      systemid       | timeline |  xlogpos  | dbname 
---------------------+----------+-----------+--------
 7266128669602238590 |        1 | 0/8DD9EF8 | 
(1 row)

it works, so the streaming connection between Barman and PostgreSQL is working.

I should mention that Barman is connected to the primary node and not to pgpool. I also tried connecting through pgpool, but that didn't work at all, and I've read somewhere that Barman can't work with pgpool. Is this true?

Is my setup correct? 

What could be the issue? Can you help?

I can provide the values files for the 2 Helm charts and the commands I've used, if you want to reproduce this on any K8s cluster.

Thank you!

Vali

Mike Wallace

Aug 23, 2023, 11:46:52 AM
to pgba...@googlegroups.com
Hi Vali,

My best guess here is that, because pg-postgresql-ha-postgresql is a Kubernetes service which has both the primary and the two standbys as endpoints, Barman is fetching the remote status from a different PostgreSQL server than the one where the replication slot was created. Because Barman is not cluster-aware, it assumes this means there is no replication slot at all and so refuses to take a backup.
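
One way to test this guess against the barman diagnose output posted earlier in the thread: that output reports "is_in_recovery": true, which is what a standby returns, so at that moment Barman was apparently not talking to the primary. The Python sketch below pulls the relevant status fields out of the diagnose JSON. It assumes the JSON has been saved without the log warning lines, and the meaning given to the three replication_slot elements (slot name, active flag, restart LSN) is an assumption based on how the values look in the post. The function name is hypothetical.

```python
import json


def remote_status_summary(diagnose_text, server_name):
    """Extract the status fields that hint at which kind of node Barman reached."""
    status = json.loads(diagnose_text)["servers"][server_name]["status"]
    # Assumption: replication_slot is [slot_name, active, restart_lsn] or null.
    slot = status.get("replication_slot") or [None, None, None]
    return {
        "is_in_recovery": status.get("is_in_recovery"),  # True on a standby
        "slot_name": slot[0],
        "slot_active": slot[1],
        "current_lsn": status.get("current_lsn"),
    }


# Trimmed copy of the values posted in this thread.
SAMPLE = """
{"servers": {"pg": {"status": {
    "current_lsn": "0/30CF9108",
    "is_in_recovery": true,
    "replication_slot": ["barman", true, "0/30000000"]
}}}}
"""

if __name__ == "__main__":
    print(remote_status_summary(SAMPLE, "pg"))
```

Running barman diagnose a few times and comparing is_in_recovery between runs could show whether the service is handing out different backends.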

You could try setting an individual pod in the conninfo so that Barman is always talking to the same PostgreSQL instance. This wouldn't be very idiomatic Kubernetes, but it might help as a short-term workaround.
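
As a sketch of that workaround: pods in a Kubernetes StatefulSet get a stable DNS name of the form pod.headless-service.namespace.svc.cluster-domain, so a conninfo can be pinned to one instance. This assumes the PostgreSQL pods are managed by a StatefulSet with a headless service, which is an assumption about the chart. The pod name, headless service name and namespace below are hypothetical and need to be checked against the actual release (for example with kubectl get pods,svc); cluster.local is the default cluster domain and may differ.

```python
def pinned_conninfo(pod, headless_service, namespace, user, dbname=None,
                    cluster_domain="cluster.local"):
    """Build a libpq conninfo string that targets one specific pod."""
    host = f"{pod}.{headless_service}.{namespace}.svc.{cluster_domain}"
    parts = [f"host={host}", f"user={user}"]
    if dbname:
        parts.append(f"dbname={dbname}")
    return " ".join(parts)


# Hypothetical names; only the users and dbname come from the thread.
POD = "pg-postgresql-ha-postgresql-0"
HEADLESS = "pg-postgresql-ha-postgresql-headless"
NAMESPACE = "default"

conninfo = pinned_conninfo(POD, HEADLESS, NAMESPACE, "barman", "postgres")
streaming_conninfo = pinned_conninfo(POD, HEADLESS, NAMESPACE, "streaming_barman")

# These two lines would go in the server section of the Barman configuration.
print(f"conninfo = {conninfo}")
print(f"streaming_conninfo = {streaming_conninfo}")
```

If the pinned pod stops being the primary after a failover, the configuration would presumably have to be updated by hand, which is why this is only a stopgap.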

Hope this helps,

Mike


Valentin Protiuc

Aug 24, 2023, 2:53:45 PM
to Barman, Backup and Recovery Manager for PostgreSQL
Hi Mike!

Thank you a lot for your answer!

Yes, pg-postgresql-ha-postgresql is a Kubernetes service, and Barman connects to that, not directly to a pod.
Indeed, after setting the replica count to just 1 for my PostgreSQL-HA server (so there is a single pod, only the primary, without the two standbys), everything worked as expected and I managed to do backups and restores.
This is OK for a PoC and should be fine in the short term, but in a real production environment this solution will not be accepted.

I went through a tutorial on PostgreSQL clustering that included a Barman setup (https://www.youtube.com/playlist?list=PLpNYlUeSK_rnanDUNr4KiTlkLTmtqK-sQ). It mentioned that all the replicas must have the same setup and configuration as the primary, which I can understand, but I was expecting it to be easier to set up Barman in a cluster, or at least to find some documentation on this.

It would be nice (maybe in a future release) to make Barman aware of the cluster. In this PDF (page 26) I see that one solution is to write a custom script that checks for the new primary, adapts the Barman configuration and creates the replication slot on the new primary... but maybe this could be built directly into Barman...
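
A minimal sketch of what the first step of such a script might look like, in Python. The probe is passed in as a function so that the selection logic can be shown without a database; in a real script it would run SELECT pg_is_in_recovery() against each host. All host names and function names here are hypothetical, and this is not the script from the PDF.

```python
def find_primary(hosts, is_in_recovery):
    """Return the single host whose probe reports it is NOT in recovery."""
    primaries = [h for h in hosts if not is_in_recovery(h)]
    if len(primaries) != 1:
        # Zero primaries (failover in progress) or several (split brain):
        # refuse to guess and leave the Barman configuration untouched.
        raise RuntimeError(f"expected exactly one primary, found {primaries}")
    return primaries[0]


# Stand-in for a real probe; maps host -> result of pg_is_in_recovery().
FAKE_STATE = {"pg-0": True, "pg-1": False, "pg-2": True}

new_primary = find_primary(sorted(FAKE_STATE), FAKE_STATE.__getitem__)
print(new_primary)
```

Once the new primary is known, the script would still have to point conninfo and streaming_conninfo at it and create the slot there, for example with barman receive-wal --create-slot pg.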

Again, thanks for your hint, it helped me with my PoC! 

Vali

Mike Wallace

Aug 25, 2023, 5:19:20 AM
to pgba...@googlegroups.com
Hi Vali,

Glad I was of some help and thank you for the links - I wasn't aware of that PostgreSQL clustering tutorial so I'm watching the Barman section today :)

The custom scripting suggested in the PDF you linked is exactly the kind of thing we're planning to build into Barman. The idea is to provide commands that cluster management tools can use to tell Barman there is a new primary - Barman would then update its configuration and take care of any changes to WAL streaming processes which may be required. We don't have an ETA for this yet but it is being actively worked on.

Best regards,

Mike

boncalo mihai

Jun 12, 2024, 1:29:09 AM
to Barman, Backup and Recovery Manager for PostgreSQL
Hi Mike,
I'm having the same issue.
Was this implemented in Barman?

Thank you,
Mihai Boncalo.

Martin Marques

Jun 17, 2024, 12:27:01 PM
to pgba...@googlegroups.com
Hi Mihai,

Yes, this was implemented and released in Barman 3.10 as configuration
models. You can read more about this here:
https://docs.pgbarman.org/release/3.10.1/#configuration

We've already pushed changes to Patroni that help it use Barman and
switch properly between nodes after failovers or switchovers.
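
A rough sketch of how a failover hook might drive this, in Python. Going by the 3.10 documentation, a model is a configuration section marked model = true and tied to a server through a cluster option, and it is applied with barman config-switch. The server name pg comes from this thread; the model name pg:switchover and the helper function are hypothetical. The linked documentation should be checked before relying on any of it.

```python
import shlex


def config_switch_command(server, model=None):
    """Build the argv for applying a model to a server, or un-applying it."""
    cmd = ["barman", "config-switch", server]
    # Without a model name, ask Barman to drop the currently active model.
    cmd.append(model if model else "--reset")
    return cmd


# Hypothetical model name; the model itself would be defined in Barman's
# configuration files with its own conninfo / streaming_conninfo overrides.
apply_cmd = config_switch_command("pg", "pg:switchover")
reset_cmd = config_switch_command("pg")

# Print the commands instead of running them, so the sketch has no side effects.
print(shlex.join(apply_cmd))
print(shlex.join(reset_cmd))
```

A real hook would pass the list to subprocess.run as the barman user instead of printing it.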