BARMAN ERROR: failed to STOP

1,873 views
Skip to first unread message

csar...@sardini.net

unread,
Dec 14, 2016, 8:55:49 AM12/14/16
to Barman, Backup and Recovery Manager for PostgreSQL
From two day to now I can't succesfully backup my Postgresql DB, the simple backup command is:

barman backup pgcluster

the result is (at the end of backup)

2016-12-14 14:31:45,328 [27392] barman.backup_executor INFO: Copy done.
2016-12-14 14:31:45,331 [27392] barman.backup_executor INFO: Asking PostgreSQL server to finalize the backup.
2016-12-14 14:31:45,333 [27392] barman.backup ERROR: Backup failed issuing stop backup command.
DETAILS: Cannot terminate exclusive backup. You might have to manually execute pg_stop_backup on your PostgreSQL server

so my backup list is

barman@pgproxy02:~$ barman list-backup pgcluster
pgcluster 20161214T101939 - FAILED
pgcluster 20161123T021012 - Wed Nov 23 04:44:59 2016 - Size: 179.3 GiB - WAL Size: 124.4 GiB
barman@pgproxy02:~$

If log-on my postgresql and manually stop the backup I have:

psql (9.5.5)
Type "help" for help.

postgres=# select pg_stop_backup();
NOTICE:  pg_stop_backup complete, all required WAL segments have been archived
 pg_stop_backup
----------------
 1BA/DE002DE8
(1 row)


No errors found on "pgcluster" host syslog and ...

barman@pgproxy02:~$ barman check pgcluster
Server pgcluster:
        PostgreSQL: OK
        superuser: OK
        wal_level: OK
        directories: OK
        retention policy settings: OK
        backup maximum age: OK (no last_backup_maximum_age provided)
        compression settings: OK
        failed backups: FAILED (there are 1 failed backups)
        minimum redundancy requirements: OK (have 1 backups, expected at least 0)
        ssh: OK (PostgreSQL server)
        not in recovery: OK
        archive_mode: OK
        archive_command: OK
        continuous archiving: OK
        archiver errors: OK

How can I troubleshoot this case?

Thank you in advance and sorry for my poor english.
Claudio.
Message has been deleted

csar...@sardini.net

unread,
Dec 15, 2016, 3:09:00 AM12/15/16
to Barman, Backup and Recovery Manager for PostgreSQL
This issue is similar to https://groups.google.com/d/msg/pgbarman/WCC62IPn0NU/FrBvKl13BwAJ.....

I change log_level=DEBUG too.

I'm waiting the backup to complete but I already found a mistake in configuration....

[pgcluster]
description =  "Main Patroni PostgreSQL Database"
backup_method = rsync
ssh_command = ssh post...@10.101.10.38
conninfo = host=10.101.10.34 port=5000 user=postgres password=*************
retention_policy = REDUNDANCY 3

I'm using Patroni high availability solution but I use the IP address of only one node in ssh_command instead of work with VIP address!!!! So the backup end successfully if the node used in ssh_command is the master node, but fail if the node is a slave replica.

I will update this post if I discover other relevant things.

Thank you all, hoping this can help someone else.

csar...@sardini.net

unread,
Jan 5, 2017, 8:39:41 AM1/5/17
to Barman, Backup and Recovery Manager for PostgreSQL
Problem persist....

Snapshot from barman log (DEBUG mode)
........
>f+++++++++ pg_subtrans/04F1

2017-01-05 13:01:12,227 [16918] barman.command_wrappers DEBUG: Command stderr:
2017-01-05 13:01:12,228 [16918] barman.copy_controller INFO: Smart copy step 4/4: copy with checksums
2017-01-05 13:01:12,228 [16918] barman.command_wrappers DEBUG: Command: ['/usr/bin/rsync', '-e', "ssh '-p' '5021' '-o' 'UserKno
wnHostsFile=/dev/null' '-o' 'StrictHostKeyChecking=no' 'post...@10.101.10.34' '-q' '-o' 'BatchMode=yes' '-o' 'StrictHostKeyChe
cking=no'", '-z', '--exclude=/pg_xlog/*', '--exclude=/pg_log/*', '--exclude=/recovery.conf', '--exclude=/postmaster.pid', '-rLK
pts', '--delete-excluded', '--inplace', '--itemize-changes', '--itemize-changes', '--checksum', '--files-from=/tmp/barman-keMlL
T/check.list', ' :/usr/local/pgsql/data/pgclus/', '/var/lib/barman/data/pgcluster/base/20170105T082504/data']
2017-01-05 13:01:20,619 [16918] barman.command_wrappers DEBUG: Command return code: 0
2017-01-05 13:01:20,620 [16918] barman.command_wrappers DEBUG: Command stdout:
2017-01-05 13:01:20,621 [16918] barman.command_wrappers DEBUG: Command stderr:
2017-01-05 13:01:20,622 [16918] barman.copy_controller INFO: Smart copy finished: :/usr/local/pgsql/data/pgclus/ -> /var/lib/barman/data/pgcluster/base/20170105T082504/data (safe before None)
2017-01-05 13:01:20,623 [16918] barman.copy_controller INFO: Copying remote pg_control file: /usr/local/pgsql/data/pgclus/global/pg_control
2017-01-05 13:01:20,623 [16918] barman.command_wrappers DEBUG: Command: ['/usr/bin/rsync', '-e', "ssh '-p' '5021' '-o' 'UserKnownHostsFile=/dev/null' '-o' 'StrictHostKeyChecking=no' 'post...@10.101.10.34' '-q' '-o' 'BatchMode=yes' '-o' 'StrictHostKeyChecking=no'", '-z', '-rLKpts', '--delete-excluded', '--inplace', '--itemize-changes', '--itemize-changes', ' :/usr/local/pgsql/data/pgclus/global/pg_control', '/var/lib/barman/data/pgcluster/base/20170105T082504/data/global/pg_control']
2017-01-05 13:01:21,694 [16918] barman.command_wrappers DEBUG: Command return code: 0
2017-01-05 13:01:21,695 [16918] barman.command_wrappers DEBUG: Command stdout: >f..t...... pg_control

2017-01-05 13:01:21,695 [16918] barman.command_wrappers DEBUG: Command stderr:
2017-01-05 13:01:21,700 [16918] barman.backup_executor INFO: Copy done.
2017-01-05 13:01:21,706 [16918] barman.backup_executor INFO: This is the first backup for server pgcluster
2017-01-05 13:01:21,826 [16918] barman.backup_executor INFO:    000002060000028D0000001C from server pgcluster has been removed
2017-01-05 13:01:21,827 [16918] barman.backup_executor INFO: Asking PostgreSQL server to finalize the backup.
2017-01-05 13:01:21,827 [16918] barman.backup_executor DEBUG: Stop of exclusive backup
2017-01-05 13:01:21,828 [16918] barman.postgres DEBUG: Error issuing pg_stop_backup command: server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
2017-01-05 13:01:21,829 [16918] barman.backup ERROR: Backup failed issuing stop backup command.
DETAILS: Cannot terminate exclusive backup. You might have to manually execute pg_stop_backup on your PostgreSQL server

2017-01-05 13:01:21,840 [16918] barman.server DEBUG: Starting archive-wal for server pgcluster
2017-01-05 13:01:21,843 [16918] barman.wal_archiver INFO: No xlog segments found from file archival for pgcluster.
..................

I've to manually terminate backup on server with pg_stop_backup() command and it stop the backup with no errors, but the backup is FAILED so i can't use it.
I don't know whatelse I could inspect to troubleshooting this issue.

Any help is welcome.

Thank you,
Claudio.

csar...@sardini.net

unread,
Jan 19, 2017, 8:40:54 AM1/19/17
to Barman, Backup and Recovery Manager for PostgreSQL
The following snapshot is an extract of the log file for the last backup job.

The problem is still the same. ERROR: backup failed issuing stop backup command.

If I stop the backup no problem is reported:

root@pgcluster02:~# psql -U postgres -h 10.101.10.39 -p 5432
Password for user postgres:
psql (9.5.5)
Type "help" for help.

postgres=# select pg_stop_backup();
NOTICE:  pg_stop_backup complete, all required WAL segments have been archived
 pg_stop_backup
----------------
 43D/88706220
(1 row)


I'm unable to figure out what going wrong.

Any idea?

Any help is welcome.

..........
extract from barman.log:
2017-01-18 12:54:08,537 [24747] barman.config DEBUG: Including configuration file: pgcluster.conf
2017-01-18 12:54:08,537 [24747] barman.cli DEBUG: Initialised Barman version 2.1 (config: /etc/barman.conf, args: {'reuse_backup': None, 'server_name': ['pgcluster'], 'format': 'console', 'quiet': False, 'retry_times': None, 'command': 'backup', 'retry_sleep': None, 'debug': False})
2017-01-18 12:54:08,560 [24747] barman.backup_executor DEBUG: The default backup strategy for any ssh based backup_method is: exclusive_backup
2017-01-18 12:54:08,560 [24747] barman.server DEBUG: Retention policy for server pgcluster: REDUNDANCY 3
2017-01-18 12:54:08,560 [24747] barman.server DEBUG: WAL retention policy for server pgcluster: MAIN
2017-01-18 12:54:08,639 [24747] barman.server DEBUG: Check 'PostgreSQL' succeeded for server 'pgcluster'
2017-01-18 12:54:08,639 [24747] barman.server DEBUG: Check 'superuser' succeeded for server 'pgcluster'
2017-01-18 12:54:08,639 [24747] barman.server DEBUG: Check 'wal_level' succeeded for server 'pgcluster'
2017-01-18 12:54:08,640 [24747] barman.server DEBUG: Check 'directories' succeeded for server 'pgcluster'
2017-01-18 12:54:08,640 [24747] barman.server DEBUG: Check 'retention policy settings' succeeded for server 'pgcluster'
2017-01-18 12:54:08,640 [24747] barman.server DEBUG: Check 'backup maximum age' succeeded for server 'pgcluster'
2017-01-18 12:54:08,640 [24747] barman.server DEBUG: Check 'compression settings' succeeded for server 'pgcluster'
2017-01-18 12:54:08,640 [24747] barman.server DEBUG: Check 'failed backups' succeeded for server 'pgcluster'
2017-01-18 12:54:08,641 [24747] barman.server DEBUG: Check 'minimum redundancy requirements' succeeded for server 'pgcluster'
2017-01-18 12:54:08,641 [24747] barman.command_wrappers DEBUG: Command: "ssh '-p' '5021' '-o' 'UserKnownHostsFile=/dev/null' '-o' 'StrictHostKeyChecking=no' 'post...@10.101.10.34' '-q' '-o' 'BatchMode=yes' '-o' 'StrictHostKeyChecking=no' 'true'"
2017-01-18 12:54:10,193 [24747] barman.command_wrappers DEBUG: Command return code: 0
2017-01-18 12:54:10,194 [24747] barman.command_wrappers DEBUG: Command stdout:
2017-01-18 12:54:10,194 [24747] barman.command_wrappers DEBUG: Command stderr:
2017-01-18 12:54:10,195 [24747] barman.server DEBUG: Check 'ssh' succeeded for server 'pgcluster'
2017-01-18 12:54:10,197 [24747] barman.server DEBUG: Check 'not in recovery' succeeded for server 'pgcluster'
2017-01-18 12:54:10,197 [24747] barman.server DEBUG: Check 'archive_mode' succeeded for server 'pgcluster'
2017-01-18 12:54:10,198 [24747] barman.server DEBUG: Check 'archive_command' succeeded for server 'pgcluster'
2017-01-18 12:54:10,198 [24747] barman.server DEBUG: Check 'continuous archiving' succeeded for server 'pgcluster'
2017-01-18 12:54:10,198 [24747] barman.server DEBUG: Check 'archiver errors' succeeded for server 'pgcluster'
2017-01-18 12:54:10,199 [24747] barman.backup DEBUG: initialising backup information
2017-01-18 12:54:10,202 [24747] barman.backup INFO: Starting backup using rsync-exclusive method for server pgcluster in /var/lib/barman/data/pgcluster/base/20170118T125410
2017-01-18 12:54:10,203 [24747] barman.backup_executor DEBUG: detecting data directory
2017-01-18 12:54:10,213 [24747] barman.backup_executor DEBUG: detecting tablespaces
2017-01-18 12:54:10,228 [24747] barman.backup_executor DEBUG: issuing start backup command
2017-01-18 12:54:10,229 [24747] barman.backup_executor DEBUG: Start of exclusive backup
...
2017-01-18 12:56:26,042 [24747] barman.backup_executor INFO:    0000020D000003C000000002 from server pgcluster has been removed
2017-01-18 12:56:26,043 [24747] barman.backup_executor INFO:    0000020D000003C000000003 from server pgcluster has been removed
...
2017-01-18 12:56:26,985 [24747] barman.backup_executor INFO:    0000020D000003C3000000C9 from server pgcluster has been removed
2017-01-18 12:56:26,986 [24747] barman.backup_executor INFO: Copying files.
2017-01-18 12:56:26,988 [24747] barman.infofile DEBUG: Config file 'postgresql.conf' already in PGDATA
2017-01-18 12:56:26,988 [24747] barman.infofile DEBUG: Config file 'pg_hba.conf' already in PGDATA
2017-01-18 12:56:26,988 [24747] barman.infofile DEBUG: Config file 'pg_ident.conf' already in PGDATA
2017-01-18 12:56:26,991 [24747] barman.copy_controller INFO: Copy started (safe before None)
2017-01-18 12:56:26,991 [24747] barman.copy_controller INFO: Copy step 1 of 5: analyze remote PGDATA directory: /usr/local/pgsql/data/p
gclus/
2017-01-18 12:56:26,991 [24747] barman.copy_controller DEBUG: list_files: '/var/lib/barman/data/pgcluster/base/20170118T125410/data/'
2017-01-18 12:56:26,992 [24747] barman.command_wrappers DEBUG: Command: ['/usr/bin/rsync', '-e', "ssh '-p' '5021' '-o' 'UserKnownHostsF
ile=/dev/null' '-o' 'StrictHostKeyChecking=no' 'post...@10.101.10.34' '-q' '-o' 'BatchMode=yes' '-o' 'StrictHostKeyChecking=no'", '-z'
, '--exclude=/pg_log/*', '--exclude=/pg_xlog/*', '--exclude=/pg_wal/*', '--exclude=/global/pg_control', '--exclude=pgsql_tmp*', '--excl
ude=postgresql.auto.conf.tmp', '--exclude=postmaster.pid', '--exclude=postmaster.opts', '--exclude=recovery.conf', '--exclude=pg_dynshm
em/*', '--exclude=pg_notify/*', '--exclude=pg_replslot/*', '--exclude=pg_serial/*', '--exclude=pg_stat_tmp/*', '--exclude=pg_snapshots/
*', '--exclude=pg_subtrans/*', '-rLKpts', '--delete-excluded', '--inplace', '--itemize-changes', '--itemize-changes', '--no-human-reada
ble', '--list-only', '-r', '/var/lib/barman/data/pgcluster/base/20170118T125410/data/']
2017-01-18 12:56:27,039 [24747] barman.command_wrappers DEBUG: Command return code: 0
2017-01-18 12:56:27,040 [24747] barman.command_wrappers DEBUG: Command stdout: drwxrwxr-x        4096 2017/01/18 12:56:26 .

2017-01-18 12:56:27,040 [24747] barman.command_wrappers DEBUG: Command stderr:
2017-01-18 12:56:27,041 [24747] barman.copy_controller DEBUG: list_files: ':/usr/local/pgsql/data/pgclus/'
2017-01-18 12:56:27,041 [24747] barman.command_wrappers DEBUG: Command: ['/usr/bin/rsync', '-e', "ssh '-p' '5021' '-o' 'UserKnownHostsF
ile=/dev/null' '-o' 'StrictHostKeyChecking=no' 'post...@10.101.10.34' '-q' '-o' 'BatchMode=yes' '-o' 'StrictHostKeyChecking=no'", '-z'
, '--exclude=/pg_log/*', '--exclude=/pg_xlog/*', '--exclude=/pg_wal/*', '--exclude=/global/pg_control', '--exclude=pgsql_tmp*', '--excl
ude=postgresql.auto.conf.tmp', '--exclude=postmaster.pid', '--exclude=postmaster.opts', '--exclude=recovery.conf', '--exclude=pg_dynshm
em/*', '--exclude=pg_notify/*', '--exclude=pg_replslot/*', '--exclude=pg_serial/*', '--exclude=pg_stat_tmp/*', '--exclude=pg_snapshots/
*', '--exclude=pg_subtrans/*', '-rLKpts', '--delete-excluded', '--inplace', '--itemize-changes', '--itemize-changes', '--no-human-reada
ble', '--list-only', '-r', ' :/usr/local/pgsql/data/pgclus/']
2017-01-18 12:56:29,878 [24747] barman.command_wrappers DEBUG: Command return code: 0
2017-01-18 12:56:29,882 [24747] barman.command_wrappers DEBUG: Command stdout: drwx------        4096 2017/01/18 12:54:11 .
-rw-------         627 2017/01/16 09:54:35 .barman-recover.info
srwxrwxrwx           0 2017/01/18 12:18:48 .s.PGSQL.5432
-rw-------          52 2017/01/18 12:18:48 .s.PGSQL.5432.lock
...
-rw-------      229376 2017/01/18 12:54:10 pg_multixact/offsets/01C4
drwx------        4096 2017/01/16 10:00:14 pg_notify
drwx------        4096 2017/01/16 10:29:50 pg_replslot
drwx------        4096 2017/01/09 11:14:22 pg_serial
drwx------        4096 2017/01/09 13:26:36 pg_snapshots
drwx------        4096 2017/01/16 10:00:14 pg_stat
drwx------        4096 2017/01/18 12:56:24 pg_stat_tmp
drwx------        4096 2017/01/18 12:36:37 pg_subtrans
drwx------        4096 2017/01/09 11:14:22 pg_tblspc
drwx------        4096 2017/01/09 11:14:43 pg_twophase
drwx------     1683456 2017/01/18 12:54:11 pg_xlog

2017-01-18 12:56:29,887 [24747] barman.command_wrappers DEBUG: Command stderr:

2017-01-18 12:56:30,589 [24747] barman.copy_controller INFO: Copy step 2 of 5: create destination directories and delete unknown files for remote PGDATA directory: /usr/local/pgsql/data/pgclus/
2017-01-18 12:56:30,589 [24747] barman.command_wrappers DEBUG: Command: ['/usr/bin/rsync', '-e', "ssh '-p' '5021' '-o' 'UserKnownHostsFile=/dev/null' '-o' 'StrictHostKeyChecking=no' 'post...@10.101.10.34' '-q' '-o' 'BatchMode=yes' '-o' 'StrictHostKeyChecking=no'", '-z', '--exclude=/pg_log/*', '--exclude=/pg_xlog/*', '--exclude=/pg_wal/*', '--exclude=/global/pg_control', '--exclude=pgsql_tmp*', '--exclude=postgresql.auto.conf.tmp', '--exclude=postmaster.pid', '--exclude=postmaster.opts', '--exclude=recovery.conf', '--exclude=pg_dynshmem/*', '--exclude=pg_notify/*', '--exclude=pg_replslot/*', '--exclude=pg_serial/*', '--exclude=pg_stat_tmp/*', '--exclude=pg_snapshots/*', '--exclude=pg_subtrans/*', '-rLKpts', '--delete-excluded', '--inplace', '--itemize-changes', '--itemize-changes', '--recursive', '--delete', '--files-from=/tmp/barman-z5PSG7/pgdata_dir.list', '--filter', 'merge /tmp/barman-z5PSG7/pgdata_exclude_and_protect.filter', ' :/usr/local/pgsql/data/pgclus/', '/var/lib/barman/data/pgcluster/base/20170118T125410/data']
2017-01-18 12:56:32,121 [24747] barman.command_wrappers DEBUG: Command return code: 0
2017-01-18 12:56:32,122 [24747] barman.command_wrappers DEBUG: Command stdout: .d..tp..... ./
cd+++++++++ base/
cd+++++++++ base/1/
...
cd+++++++++ pg_xlog/

2017-01-18 12:56:32,122 [24747] barman.command_wrappers DEBUG: Command stderr:
2017-01-18 12:56:32,122 [24747] barman.copy_controller INFO: Copy step 3 of 5: copy safe files from remote PGDATA directory: /usr/local/pgsql/data/pgclus/
2017-01-18 12:56:32,122 [24747] barman.command_wrappers DEBUG: Command: ['/usr/bin/rsync', '-e', "ssh '-p' '5021' '-o' 'UserKnownHostsFile=/dev/null' '-o' 'StrictHostKeyChecking=no' 'post...@10.101.10.34' '-q' '-o' 'BatchMode=yes' '-o' 'StrictHostKeyChecking=no'", '-z', '--exclude=/pg_log/*', '--exclude=/pg_xlog/*', '--exclude=/pg_wal/*', '--exclude=/global/pg_control', '--exclude=pgsql_tmp*', '--exclude=postgresql.auto.conf.tmp', '--exclude=postmaster.pid', '--exclude=postmaster.opts', '--exclude=recovery.conf', '--exclude=pg_dynshmem/*', '--exclude=pg_notify/*', '--exclude=pg_replslot/*', '--exclude=pg_serial/*', '--exclude=pg_stat_tmp/*', '--exclude=pg_snapshots/*', '--exclude=pg_subtrans/*', '-rLKpts', '--delete-excluded', '--inplace', '--itemize-changes', '--itemize-changes', ' :/usr/local/pgsql/data/pgclus/', '/var/lib/barman/data/pgcluster/base/20170118T125410/data', '--files-from=/tmp/barman-z5PSG7/pgdata_safe.list']
2017-01-18 12:57:01,509 [6703] barman.config DEBUG: Including configuration file: pgcluster.conf

...
2017-01-18 16:05:32,209 [24747] barman.command_wrappers DEBUG: Command return code: 0
2017-01-18 16:05:32,212 [24747] barman.command_wrappers DEBUG: Command stdout: skipping non-regular file ".s.PGSQL.5432"
>f+++++++++ .barman-recover.info
>f+++++++++ .s.PGSQL.5432.lock
...
>f+++++++++ pg_multixact/offsets/01C2
>f+++++++++ pg_multixact/offsets/01C3
>f+++++++++ pg_multixact/offsets/01C4

2017-01-18 16:05:32,215 [24747] barman.command_wrappers DEBUG: Command stderr:
2017-01-18 16:05:32,215 [24747] barman.copy_controller INFO: Copy step 4 of 5: copy files with checksum from remote PGDATA directory: /
usr/local/pgsql/data/pgclus/
2017-01-18 16:05:32,215 [24747] barman.command_wrappers DEBUG: Command: ['/usr/bin/rsync', '-e', "ssh '-p' '5021' '-o' 'UserKnownHostsFile=/dev/null' '-o' 'StrictHostKeyChecking=no' 'post...@10.101.10.34' '-q' '-o' 'BatchMode=yes' '-o' 'StrictHostKeyChecking=no'", '-z'
, '--exclude=/pg_log/*', '--exclude=/pg_xlog/*', '--exclude=/pg_wal/*', '--exclude=/global/pg_control', '--exclude=pgsql_tmp*', '--exclude=postgresql.auto.conf.tmp', '--exclude=postmaster.pid', '--exclude=postmaster.opts', '--exclude=recovery.conf', '--exclude=pg_dynshmem/*', '--exclude=pg_notify/*', '--exclude=pg_replslot/*', '--exclude=pg_serial/*', '--exclude=pg_stat_tmp/*', '--exclude=pg_snapshots/*', '--exclude=pg_subtrans/*', '-rLKpts', '--delete-excluded', '--inplace', '--itemize-changes', '--itemize-changes', ' :/usr/local/pgsql/data/pgclus/', '/var/lib/barman/data/pgcluster/base/20170118T125410/data', '--files-from=/tmp/barman-z5PSG7/pgdata_check.list', '--checksum']
...
2017-01-18 16:05:35,619 [24747] barman.command_wrappers DEBUG: Command return code: 0
2017-01-18 16:05:35,620 [24747] barman.command_wrappers DEBUG: Command stdout:
2017-01-18 16:05:35,620 [24747] barman.command_wrappers DEBUG: Command stderr:
2017-01-18 16:05:35,621 [24747] barman.copy_controller INFO: Copy step 5 of 5: copy remote pg_control file: /usr/local/pgsql/data/pgclus/global/pg_control
2017-01-18 16:05:35,621 [24747] barman.command_wrappers DEBUG: Command: ['/usr/bin/rsync', '-e', "ssh '-p' '5021' '-o' 'UserKnownHostsFile=/dev/null' '-o' 'StrictHostKeyChecking=no' 'post...@10.101.10.34' '-q' '-o' 'BatchMode=yes' '-o' 'StrictHostKeyChecking=no'", '-z', '-rLKpts', '--delete-excluded', '--inplace', '--itemize-changes', '--itemize-changes', ' :/usr/local/pgsql/data/pgclus/global/pg_control', '/var/lib/barman/data/pgcluster/base/20170118T125410/data/global/pg_control']
2017-01-18 16:05:36,838 [24747] barman.command_wrappers DEBUG: Command return code: 0
2017-01-18 16:05:36,839 [24747] barman.command_wrappers DEBUG: Command stdout: >f+++++++++ pg_control

2017-01-18 16:05:36,839 [24747] barman.command_wrappers DEBUG: Command stderr:
2017-01-18 16:05:36,840 [24747] barman.copy_controller INFO: Copy finished (safe before None)
2017-01-18 16:05:36,843 [24747] barman.backup_executor INFO: Copy done.
2017-01-18 16:05:36,850 [24747] barman.backup_executor INFO: This is the first backup for server pgcluster

2017-01-18 16:14:33,790 [24747] barman.backup_executor INFO:    0000020D000003C3000000CA from server pgcluster has been removed
2017-01-18 16:14:33,792 [24747] barman.backup_executor INFO:    0000020D000003C3000000CB from server pgcluster has been removed
...
2017-01-18 16:14:48,108 [24747] barman.backup_executor INFO:    0000020D000003FC000000E3 from server pgcluster has been removed
2017-01-18 16:14:48,110 [24747] barman.backup_executor INFO: Asking PostgreSQL server to finalize the backup.
2017-01-18 16:14:48,110 [24747] barman.backup_executor DEBUG: Stop of exclusive backup
2017-01-18 16:14:48,111 [24747] barman.postgres DEBUG: Error issuing pg_stop_backup command: server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
2017-01-18 16:14:48,112 [24747] barman.backup ERROR: Backup failed issuing stop backup command.
Message has been deleted

csar...@sardini.net

unread,
Jan 30, 2017, 5:01:54 AM1/30/17
to Barman, Backup and Recovery Manager for PostgreSQL
UPDATE:

I try to solve the problem changing the backup method. So i configure barman to use backup_method = postgre and all the relative parameters like streaming_conninfo and so on.
Then:
barman@pgproxy02:~$ barman check pgcluster_a
Server pgcluster_a:
        PostgreSQL: OK
        superuser: OK
        PostgreSQL streaming: OK
        wal_level: OK
        directories: OK
        retention policy settings: OK
        backup maximum age: OK (no last_backup_maximum_age provided)
        compression settings: OK
        failed backups: FAILED (there are 2 failed backups)
        minimum redundancy requirements: OK (have 1 backups, expected at least 0)
        pg_basebackup: OK
        pg_basebackup compatible: OK
        pg_basebackup supports tablespaces mapping: OK
        archive_mode: OK
        archive_command: OK
        continuous archiving: OK
        pg_receivexlog: OK
        pg_receivexlog compatible: OK
        receive-wal running: OK
        archiver errors: FAILED (unknown: 5)

The "archive errors: FAILED (unknow :5)" is relative to a nagios port sniffing not complete configured.
I try backup again and the result is similar to previous situation. Backup FAILED at the end of the process.

Following the extract of barman.log:

2017-01-30 09:05:24,577 [31470] barman.config DEBUG: Including configuration file: pgcluster.conf
2017-01-30 09:05:24,578 [31470] barman.cli DEBUG: Initialised Barman version 2.1 (config: /etc/barman.conf, args: {'reuse_backup': None
, 'server_name': ['pgcluster_a'], 'format': 'console', 'quiet': False, 'retry_times': None, 'command': 'backup', 'retry_sleep': None, '
debug': False})
2017-01-30 09:05:24,607 [31470] barman.backup_executor DEBUG: The default backup strategy for postgres backup_method is: concurrent_bac
kup
2017-01-30 09:05:24,607 [31470] barman.server DEBUG: Retention policy for server pgcluster_a: REDUNDANCY 3
2017-01-30 09:05:24,608 [31470] barman.server DEBUG: WAL retention policy for server pgcluster_a: MAIN
2017-01-30 09:05:24,755 [31470] barman.command_wrappers DEBUG: Command: ['/usr/bin/pg_receivexlog', '--version']
2017-01-30 09:05:24,820 [31470] barman.command_wrappers DEBUG: Command return code: 0
2017-01-30 09:05:24,820 [31470] barman.command_wrappers DEBUG: Command stdout: pg_receivexlog (PostgreSQL) 9.6.1

2017-01-30 09:05:24,821 [31470] barman.command_wrappers DEBUG: Command stderr:
2017-01-30 09:05:24,823 [31470] barman.wal_archiver DEBUG: Look for 'barman_receive_wal' in 'synchronous_standby_names': ['']
2017-01-30 09:05:24,823 [31470] barman.wal_archiver INFO: Synchronous WAL streaming for barman_receive_wal: False
2017-01-30 09:05:24,824 [31470] barman.command_wrappers DEBUG: Command: ['/usr/bin/pg_basebackup', '--version']
2017-01-30 09:05:24,911 [31470] barman.command_wrappers DEBUG: Command return code: 0
2017-01-30 09:05:24,912 [31470] barman.command_wrappers DEBUG: Command stdout: pg_basebackup (PostgreSQL) 9.6.1

2017-01-30 09:05:24,912 [31470] barman.command_wrappers DEBUG: Command stderr:
2017-01-30 09:05:24,913 [31470] barman.server DEBUG: Check 'PostgreSQL' succeeded for server 'pgcluster_a'
2017-01-30 09:05:24,913 [31470] barman.server DEBUG: Check 'superuser' succeeded for server 'pgcluster_a'
2017-01-30 09:05:24,913 [31470] barman.server DEBUG: Check 'PostgreSQL streaming' succeeded for server 'pgcluster_a'
2017-01-30 09:05:24,913 [31470] barman.server DEBUG: Check 'wal_level' succeeded for server 'pgcluster_a'
2017-01-30 09:05:24,914 [31470] barman.server DEBUG: Check 'directories' succeeded for server 'pgcluster_a'
2017-01-30 09:05:24,914 [31470] barman.server DEBUG: Check 'retention policy settings' succeeded for server 'pgcluster_a'
2017-01-30 09:05:24,914 [31470] barman.server DEBUG: Check 'backup maximum age' succeeded for server 'pgcluster_a'
2017-01-30 09:05:24,914 [31470] barman.server DEBUG: Check 'compression settings' succeeded for server 'pgcluster_a'
2017-01-30 09:05:24,917 [31470] barman.server INFO: Ignoring failed check 'failed backups' for server 'pgcluster_a'
2017-01-30 09:05:24,917 [31470] barman.server DEBUG: Check 'minimum redundancy requirements' succeeded for server 'pgcluster_a'
2017-01-30 09:05:24,917 [31470] barman.server DEBUG: Check 'pg_basebackup' succeeded for server 'pgcluster_a'
2017-01-30 09:05:24,917 [31470] barman.server DEBUG: Check 'pg_basebackup compatible' succeeded for server 'pgcluster_a'
2017-01-30 09:05:24,923 [31470] barman.server DEBUG: Check 'pg_basebackup supports tablespaces mapping' succeeded for server 'pgcluster_a'
2017-01-30 09:05:24,923 [31470] barman.server DEBUG: Check 'archive_mode' succeeded for server 'pgcluster_a'
2017-01-30 09:05:24,923 [31470] barman.server DEBUG: Check 'archive_command' succeeded for server 'pgcluster_a'
2017-01-30 09:05:24,923 [31470] barman.server DEBUG: Check 'continuous archiving' succeeded for server 'pgcluster_a'
2017-01-30 09:05:24,923 [31470] barman.server DEBUG: Check 'pg_receivexlog' succeeded for server 'pgcluster_a'
2017-01-30 09:05:24,923 [31470] barman.server DEBUG: Check 'pg_receivexlog compatible' succeeded for server 'pgcluster_a'
2017-01-30 09:05:24,924 [31470] barman.server DEBUG: Check 'receive-wal running' succeeded for server 'pgcluster_a'
2017-01-30 09:05:24,924 [31470] barman.server INFO: Ignoring failed check 'archiver errors' for server 'pgcluster_a'
2017-01-30 09:05:24,924 [31470] barman.backup DEBUG: initialising backup information
2017-01-30 09:05:24,926 [31470] barman.backup INFO: Starting backup using postgres method for server pgcluster_a in /var/lib/barman/data/pgcluster_a/base/20170130T090524
2017-01-30 09:05:24,928 [31470] barman.backup_executor DEBUG: detecting data directory
2017-01-30 09:05:24,931 [31470] barman.backup_executor DEBUG: detecting tablespaces
2017-01-30 09:05:24,932 [31470] barman.backup_executor DEBUG: issuing start backup command
2017-01-30 09:05:24,936 [31470] barman.backup_executor INFO: Backup start at xlog location: 6F/5C024810 (000000190000006F0000005C, 00024810)
2017-01-30 09:05:24,937 [31470] barman.backup_executor INFO: Copying files.
2017-01-30 09:05:24,937 [31470] barman.command_wrappers DEBUG: Command: ['/usr/bin/pg_basebackup', '--dbname=dbname=replication host=10.101.10.34 port=5001 replication=true user=stream_barman application_name=barman_streaming_backup', '-v', '--no-password', '--pgdata=/var/lib/barman/data/pgcluster_a/base/20170130T090524/data', '--checkpoint=fast']
....
2017-01-30 09:38:29,774 [31470] barman.command_wrappers DEBUG: Command stderr: WARNING:  skipping special file "./.s.PGSQL.5532"
NOTICE:  pg_stop_backup complete, all required WAL segments have been archived
pg_basebackup: base backup completed

2017-01-30 09:38:29,777 [31470] barman.backup_executor INFO: Copy done.
2017-01-30 09:38:29,793 [31470] barman.backup_executor INFO: Finalising the backup.
2017-01-30 09:38:29,794 [31470] barman.postgres DEBUG: Error calling pg_is_in_recovery() function: server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
2017-01-30 09:38:29,794 [31470] barman.postgres DEBUG: Error retrieving current xlog detailed information: cursor already closed
2017-01-30 09:38:29,795 [31470] barman.backup ERROR: Backup failed copying files.
DETAILS: 'NoneType' object has no attribute 'copy'

2017-01-30 09:38:29,797 [31470] barman.cli ERROR: connection already closed
See log file for more details.
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/barman/cli.py", line 1041, in main
    p.dispatch(pre_call=global_config)
  File "/usr/lib/python2.7/dist-packages/argh/helpers.py", line 55, in dispatch
    return dispatch(self, *args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/argh/dispatching.py", line 174, in dispatch
    for line in lines:
  File "/usr/lib/python2.7/dist-packages/argh/dispatching.py", line 277, in _execute_command
    for line in result:
  File "/usr/lib/python2.7/dist-packages/argh/dispatching.py", line 231, in _call
    result = function(namespace_obj)
  File "/usr/lib/python2.7/dist-packages/barman/cli.py", line 208, in backup
    server.backup()
  File "/usr/lib/python2.7/contextlib.py", line 154, in __exit__
    self.thing.close()
  File "/usr/lib/python2.7/dist-packages/barman/server.py", line 388, in close
    self.postgres.close()
  File "/usr/lib/python2.7/dist-packages/barman/postgres.py", line 164, in close
    self._conn.rollback()
InterfaceError: connection already closed
2017-01-30 09:38:29,822 [31470] barman.postgres WARNING: Forcing PostgreSQLConnection cleanup during process shut down.

And th extract of relative log in pg_log:

2017-01-30 09:05:18.866 CET - DEBUG:  received replication command: IDENTIFY_SYSTEM
2017-01-30 09:05:18.866 CET - DEBUG:  received replication command: START_REPLICATION SLOT "pgcluster01a" 6F/5C000000 TIMELINE 25
2017-01-30 09:05:18.868 CET - DEBUG:  standby "pgcluster01a" has now caught up with primary
2017-01-30 09:05:24.661 CET - DEBUG:  received replication command: IDENTIFY_SYSTEM
2017-01-30 09:05:24.993 CET - DEBUG:  received replication command: IDENTIFY_SYSTEM
2017-01-30 09:05:24.994 CET - DEBUG:  received replication command: BASE_BACKUP LABEL 'pg_basebackup base backup'   FAST
2017-01-30 09:05:25.099 CET - DEBUG:  performing replication slot checkpoint
2017-01-30 09:05:26.758 CET - DEBUG:  archived transaction log file "000000190000006F0000005C"
2017-01-30 09:05:39.471 CET - DEBUG:  autovacuum: processing database "postgres"
....
2017-01-30 09:34:40.663 CET - DEBUG:  autovacuum: processing database "postgres"
2017-01-30 09:35:24.930 CET - LOG:  unexpected EOF on client connection with an open transaction
2017-01-30 09:35:25.550 CET - DEBUG:  performing replication slot checkpoint
2017-01-30 09:35:40.721 CET - DEBUG:  autovacuum: processing database "postgres"
2017-01-30 09:36:40.747 CET - DEBUG:  autovacuum: processing database "postgres"
2017-01-30 09:37:40.795 CET - DEBUG:  autovacuum: processing database "postgres"
2017-01-30 09:38:26.601 CET - WARNING:  skipping special file "./.s.PGSQL.5532"
2017-01-30 09:38:28.396 CET - DEBUG:  archived transaction log file "000000190000006F0000005D.00000028.backup"
2017-01-30 09:38:29.465 CET - DEBUG:  archived transaction log file "000000190000006F0000005D"
2017-01-30 09:38:29.765 CET - NOTICE:  pg_stop_backup complete, all required WAL segments have been archived
2017-01-30 09:38:40.835 CET - DEBUG:  autovacuum: processing database "postgres"

Please help me.

Any suggestion is welcome.

Thanks again and sorry for my basic english.

Nathan Foley

unread,
Feb 7, 2017, 3:59:24 AM2/7/17
to Barman, Backup and Recovery Manager for PostgreSQL
Hi Claudio,

I have updated my post that you referenced:

May be of some help to you?

Thanks,
Nath

csar...@sardini.net

unread,
Feb 9, 2017, 2:58:12 AM2/9/17
to Barman, Backup and Recovery Manager for PostgreSQL
Hi Nathan,
maybe, maybe. I also suspect this is an issue related to connection timeout, because my failed backup stand up for more than two hours. I have no VPN and I use two virtual host on proxmox, one for barman, the other for postgresql, in the same network, but sometime flash up some network timeout.

I've seen the "Latest commit 68dcdd0" so I'll wait for next release to see if this would resolve my issue.

Thank you for your suggestion.

Nathan Foley

unread,
Feb 9, 2017, 7:22:17 PM2/9/17
to Barman, Backup and Recovery Manager for PostgreSQL
Dont know much about Proxmox, but if there is a virt f/w / g/w with a TCP timeout < your servers , then it may be tearing down the connection before a keepalive is sent?

Anyway, the commit you point to looks like it should solve your problem :)

csar...@sardini.net

unread,
Jul 25, 2017, 8:25:00 AM7/25/17
to Barman, Backup and Recovery Manager for PostgreSQL
UPDATE:

      BARMAN 2.2 resolve this issue...... The problem was only the timeout in connection to postgreSQL due to long time backup (more than 1 hour). Last release of barman try to reconnect and terminate backup operation correctly.
Reply all
Reply to author
Forward
0 new messages