Backup failed with "error": "failure copying files ([Errno 5] Input/output error)", barman log ok


Biljana Jovanoski

Jan 10, 2021, 3:44:36 AM
to Barman, Backup and Recovery Manager for PostgreSQL

Hi,

All Barman checks for the server pass, and the backup seems to do everything it is supposed to do. At the end of the backup (1.7 TB database, backup running for around 17 hours) barman.log shows the lines below, and yet the status of the backup is failed.

2021-01-10 02:51:06,822 [28656] barman.command_wrappers DEBUG: Command stdout:

2021-01-10 02:51:06,822 [28656] barman.command_wrappers DEBUG: Command stderr: pg_basebackup: initiating base backup, waiting for checkpoi$

pg_basebackup: checkpoint completed

NOTICE:  base backup done, waiting for required WAL segments to be archived

NOTICE:  all required WAL segments have been archived

pg_basebackup: syncing data to disk ...

pg_basebackup: base backup completed

 

The server we are backing up is a standby. The latest attempt used streaming_archiver = off and archiver = on, but the behavior was the same with the setting we prefer (streaming_archiver = on, archiver = off). I have tried both backup methods: rsync and postgres.

There is no error in the postgres log; the only error is in the output of barman diagnose:

   "error": "failure copying files ([Errno 5] Input/output error)",

 

During one of the attempts (I cannot reproduce it now) I got a crash file, /var/crash/_usr_bin_barman.111.crash, which I have attached.

 

What should we do next? What is the last step Barman performs before it marks the backup? Could it be a permission problem on some files, some additional setup on the postgres server related to WALs, or is there some other direction we can check in order to get a successful backup?

Thank you for your time. I hope someone can give a hint on what to try next. Details of the configuration are below.

 

Postgresql.conf

 

 

archive_mode = always           # enables archiving; off, on, or always

archive_command = 'barman-wal-archive barman db3 %p'             # command to use to archive a logfile segment

Server configuration

[db3]

 

active=true

description = “Affectli Standby Live Postgres Server 2”

ssh_command = ssh postgres@db2

conninfo = host=db2 user=barman dbname=postgres port=5432 application_name=barman keepalives=1

streaming_conninfo = host=db2 dbname=postgres user=streaming_barman  port=5432 application_name=barman keepalives=1

backup_method = postgres

backup_options=concurrent_backup

streaming_archiver = off

archiver = on

backup_directory = /opt/affectli/barman/aff_db2

retention_policy_mode = auto

retention_policy = RECOVERY WINDOW OF 7 days

 

wal_retention_policy = main

slot_name=barman_new2

create_slot=auto

 

parallel_jobs = 6

path_prefix= "/usr/lib/postgresql/12/bin"

 

 

barman.conf

compression=pigz

 

 barman show-server db3

Server db3:

        active: True

        archive_command: barman-wal-archive barman db3 %p

        archive_mode: always

        archive_timeout: 0

        archived_count: 36804

        archiver: True

        archiver_batch_size: 0

        backup_directory: /opt/affectli/barman/aff_db2

        backup_method: postgres

        backup_options: BackupOptions({'concurrent_backup'})

        bandwidth_limit: None

        barman_home: /var/lib/barman

        barman_lock_directory: /var/lib/barman

        basebackup_retry_sleep: 30

        basebackup_retry_times: 0

        basebackups_directory: /opt/affectli/barman/aff_db2/base

        check_timeout: 30

        checkpoint_timeout: 300

        compression: pigz

        config_file: /etc/postgresql/12/main/postgresql.conf

        connection_error: None

        conninfo: host=db2 user=barman dbname=postgres port=5432 application_name=barman keepalives=1

        create_slot: auto

        current_archived_wals_per_second: 0.06604091090823545

        current_lsn: 1710/CB401CF8

        current_size: 1798368481051

        current_xlog: None

        custom_compression_filter: None

        custom_decompression_filter: None

        data_checksums: on

        data_directory: /var/lib/postgresql/12/main

        description: “Affectli Standby Live Postgres Server 2”

        disabled: False

        errors_directory: /opt/affectli/barman/aff_db2/errors

        failed_count: 271

        has_backup_privileges: True

        hba_file: /etc/postgresql/12/main/pg_hba.conf

        hot_standby: on

        ident_file: /etc/postgresql/12/main/pg_ident.conf

        immediate_checkpoint: True

        included_files: ['/var/lib/postgresql/12/main/postgresql.auto.conf']

        incoming_wals_directory: /opt/affectli/barman/aff_db2/incoming

        is_archiving: True

        is_in_recovery: True

        is_superuser: True

        last_archived_time: 2021-01-10 09:09:25.936111+01:00

        last_archived_wal: 0000000300001710000000CA

        last_backup_maximum_age: None

        last_failed_time: 2021-01-08 15:25:48.481138+01:00

        last_failed_wal: 00000003000016F20000004E

        max_incoming_wals_queue: None

        max_replication_slots: 10

        max_wal_senders: 10

        minimum_redundancy: 0

        msg_list: []

        name: db3

        network_compression: False

        parallel_jobs: 6

        passive_node: False

        path_prefix: /usr/lib/postgresql/12/bin

        pg_basebackup_bwlimit: True

        pg_basebackup_compatible: True

        pg_basebackup_installed: True

        pg_basebackup_path: /usr/lib/postgresql/12/bin/pg_basebackup

        pg_basebackup_tbls_mapping: True

        pg_basebackup_version: 12.4

        pgespresso_installed: False

        post_archive_retry_script: None

        post_archive_script: None

        post_backup_retry_script: None

        post_backup_script: None

        post_delete_retry_script: None

        post_delete_script: None

        post_recovery_retry_script: None

        post_recovery_script: None

        post_wal_delete_retry_script: None

        post_wal_delete_script: None

        postgres_systemid: 6818593784879557275

        pre_archive_retry_script: None

        pre_archive_script: None

        pre_backup_retry_script: None

        pre_backup_script: None

        pre_delete_retry_script: None

        pre_delete_script: None

        pre_recovery_retry_script: None

        pre_recovery_script: None

        pre_wal_delete_retry_script: None

        pre_wal_delete_script: None

        primary_ssh_command: None

        recovery_options: RecoveryOptions()

        replication_slot: Record(slot_name='barman_new2', active=False, restart_lsn='16FE/D4000000')

        replication_slot_support: True

        retention_policy: RECOVERY WINDOW OF 7 DAYS

        retention_policy_mode: auto

        reuse_backup: None

        server_txt_version: 12.3

        slot_name: barman_new2

        ssh_command: ssh postgres@db2

        stats_reset: 2021-01-03 22:21:18.808274+01:00

        streaming: True

        streaming_archiver: False

        streaming_archiver_batch_size: 0

        streaming_archiver_name: barman_receive_wal

        streaming_backup_name: barman_streaming_backup

        streaming_conninfo: host=db2 dbname=postgres user=streaming_barman  port=5432 application_name=barman keepalives=1

        streaming_supported: True

        streaming_systemid: 6818593784879557275

        streaming_wals_directory: /opt/affectli/barman/aff_db2/streaming

        synchronous_standby_names: ['']

        tablespace_bandwidth_limit: None

        timeline: 3

        wal_compression: on

        wal_keep_segments: 1280

        wal_level: replica

        wal_retention_policy: MAIN

        wals_directory: /opt/affectli/barman/aff_db2/wals

        xlog_segment_size: 16777216

        xlogpos: 1710/CB409990

 

barman check db3

Server db3:

        PostgreSQL: OK

        superuser or standard user with backup privileges: OK

        PostgreSQL streaming: OK

        wal_level: OK

        replication slot: OK

        directories: OK

        retention policy settings: OK

        backup maximum age: OK (no last_backup_maximum_age provided)

        compression settings: OK

        failed backups: OK (there are 0 failed backups)

        minimum redundancy requirements: OK (have 0 backups, expected at least 0)

        ssh: OK (PostgreSQL server)

        systemid coherence: OK

        pg_receivexlog: OK

        pg_receivexlog compatible: OK

        receive-wal running: OK

        archive_mode: OK

        archive_command: OK

        continuous archiving: OK

        archiver errors: OK

 

 

 

Biljana Jovanoski

Feb 3, 2021, 11:00:37 AM
to Barman, Backup and Recovery Manager for PostgreSQL
I got it working. The only thing I changed was the owner of /etc/barman.d, which is now barman instead of root.

Luca Ferrari

Feb 4, 2021, 2:47:30 AM
to Barman, Backup and Recovery Manager for PostgreSQL
On Wed, Feb 3, 2021 at 5:00 PM 'Biljana Jovanoski' via Barman, Backup
and Recovery Manager for PostgreSQL <pgba...@googlegroups.com> wrote:
>
> I got it working, i only changed owner on /etc/barman.d to be barman instead of root
>

Uhm... I don't see how changing the owner of the configuration directory
could fix an I/O error during the file copy. I would have expected you to
change the owner of the backup directory, that is:
backup_directory = /opt/affectli/barman/aff_db2

Are you sure you did not also change the owner of the above directory?

Luca

Biljana Jovanoski

Feb 4, 2021, 3:02:23 AM
to Barman, Backup and Recovery Manager for PostgreSQL
That folder was already owned by barman, but the backup kept failing until I changed the owner of /etc/barman.d.
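
For anyone who wants to verify the ownership of the two directories discussed in this exchange, here is a minimal Python sketch. It only reports which paths are not owned by a given user. The helper name paths_not_owned_by is made up for illustration and is not part of Barman; the paths and the barman user name are the ones mentioned in this thread.

```python
import os
import pwd


def paths_not_owned_by(paths, user):
    """Return (path, actual_owner) pairs for every path not owned by `user`.

    Hypothetical helper for illustration only; it is not a Barman API.
    """
    expected_uid = pwd.getpwnam(user).pw_uid
    mismatches = []
    for path in paths:
        uid = os.stat(path).st_uid
        if uid != expected_uid:
            try:
                owner = pwd.getpwuid(uid).pw_name
            except KeyError:
                # The uid has no passwd entry; report the number instead.
                owner = str(uid)
            mismatches.append((path, owner))
    return mismatches


if __name__ == "__main__":
    # Paths taken from this thread: the configuration directory and the
    # backup_directory of server db3. Adjust them for your own setup.
    to_check = ["/etc/barman.d", "/opt/affectli/barman/aff_db2"]
    for path, owner in paths_not_owned_by(to_check, "barman"):
        print(f"{path} is owned by {owner}, not barman")
```

An empty result only means the top-level directories have the expected owner; it does not check the files below them.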

Stefan Badenhorst

Feb 25, 2022, 4:18:07 PM
to Barman, Backup and Recovery Manager for PostgreSQL
This happened again on the same server, and I found another post on GitHub that seems to explain the problem better:
It has to do with running the barman backup command from a terminal and sys.stdout then being disconnected.
In this case I ran the backup manually (usually it is triggered by cron) and at the end of the backup it failed in the same way.
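
If the lost terminal is indeed the cause, one possible workaround for manual runs is to start the backup so that it does not depend on the terminal at all. The Python sketch below is only an assumption about how that could be done, not something confirmed in this thread: it starts a command in its own session with stdout and stderr sent to a log file. The helper name run_detached and the log file path are hypothetical.

```python
import subprocess
import sys


def run_detached(command, log_path):
    """Start `command` in its own session, with output appended to a log file.

    Hypothetical helper for illustration only. Because stdout and stderr
    point at a file, writes keep working after the terminal is closed.
    """
    log_file = open(log_path, "ab")
    try:
        return subprocess.Popen(
            command,
            stdin=subprocess.DEVNULL,   # do not read from the terminal
            stdout=log_file,            # output goes to the log file
            stderr=subprocess.STDOUT,   # errors go to the same file
            start_new_session=True,     # new session, detached from the tty
        )
    finally:
        # The child holds its own copy of the descriptor.
        log_file.close()


if __name__ == "__main__":
    # Server name db3 is from this thread; the log path is an assumption.
    process = run_detached(
        ["barman", "backup", "db3"],
        "/tmp/barman-manual-backup.log",
    )
    print(f"started barman backup with pid {process.pid}", file=sys.stderr)
```

Running the manual backup the same way cron does (no terminal attached, output redirected) should avoid the difference between the two cases described above.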
