recovery error


gap

Mar 11, 2014, 5:19:59 AM
to pgba...@googlegroups.com
Hi,

first of all, thanks for the Barman tool and your support.

I'm using Barman (v1.3.0) on a test system running openSUSE 13.1.
Backups work fine, but recovery fails:

barman.log:
2014-03-11 08:40:58,106 barman.backup INFO: Starting local restore for server main using backup 20140311T083926
2014-03-11 08:40:58,106 barman.backup INFO: Destination directory: /data/hdd/databases/test_restore/
2014-03-11 08:40:58,106 barman.backup INFO: Copying the base backup.
2014-03-11 08:40:58,258 barman.cli ERROR: Unhandled exception. See log file for more details.
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/barman/cli.py", line 554, in main
    p.dispatch(pre_call=global_config, output_file=_output_stream)
  File "/usr/lib/python2.7/site-packages/argh/helpers.py", line 47, in dispatch
    return dispatch(self, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/argh/dispatching.py", line 121, in dispatch
    for line in lines:
  File "/usr/lib/python2.7/site-packages/argh/dispatching.py", line 197, in _execute_command
    for line in result:
  File "/usr/lib/python2.7/site-packages/argh/dispatching.py", line 185, in _call
    for line in result:
  File "/usr/lib/python2.7/site-packages/barman/cli.py", line 269, in recover
    remote_command=args.remote_ssh_command
  File "/usr/lib/python2.7/site-packages/barman/backup.py", line 380, in recover
    self.recover_basebackup_copy(backup, dest, remote_command)
  File "/usr/lib/python2.7/site-packages/barman/backup.py", line 745, in recover_basebackup_copy
    raise Exception("ERROR: data transfer failure")
Exception: ERROR: data transfer failure


I guess something is wrong with the permissions on the recovery path?
1. Is the test_restore directory created automatically, or do I need to create it?
2. Do I need to be careful about the path and permissions?

Thanks a lot for your help,

Gaby


Giulio Calacoci

Mar 12, 2014, 6:48:07 AM
to pgba...@googlegroups.com
Hi Gaby,

The error "ERROR: data transfer failure" means that rsync returned a non-zero exit code. For a local recovery,
this usually means that rsync was unable to create the required directories.

As you guessed, you need to create the destination directory for a restore yourself, and you need to be careful about the path and permissions.
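
A minimal sketch of a pre-flight check for the restore destination, to be run as the user that runs Barman. The helper name `check_restore_destination` is hypothetical and not part of Barman; it only tests that the directory exists and that the current user may write into it, which is the condition described above.

```python
import os


def check_restore_destination(path):
    """Return a list of problems likely to stop a local restore into path."""
    problems = []
    if not os.path.isdir(path):
        # The destination has to be created before the recovery is started.
        problems.append("missing: %s is not an existing directory" % path)
    elif not os.access(path, os.W_OK | os.X_OK):
        # rsync must be able to create files and subdirectories here.
        problems.append("permissions: current user cannot write to %s" % path)
    return problems


# Destination path taken from the log in the first message.
print(check_restore_destination("/data/hdd/databases/test_restore/"))
```

An empty list means neither of these two problems was found; it does not prove that rsync itself will succeed.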

Keep in mind that in the past, due to a bug in SUSE's rsync package, Barman was not able to perform backup and recovery operations correctly.
This does not seem to be your situation, but I suggest you double-check your backups.

Regards

Giulio


-- 
 Giulio Calacoci - 2ndQuadrant Italia
 PostgreSQL Training, Services and Support
 giulio....@2ndQuadrant.it | www.2ndQuadrant.it 

gap

Mar 28, 2014, 5:42:51 AM
to pgba...@googlegroups.com
Hi,
thanks for your answer and tips.

How can I check that the backup completed correctly?

"barman list-backup main" shows me the backups I've made, and it seems to be all right:

 main 20140327T152629 - Thu Mar 27 15:26:35 2014 - Size: 4.5 MiB - WAL Size: 0 B

(I guess the WAL size is 0 because it's a test server and nothing is happening on it.)

I looked in the backup.info file of the backup I wanted to use for the test restore and found the following:

begin_offset=32
begin_time=2014-03-27 15:26:29.686138
begin_wal=000000010000006400000010
begin_xlog=64/10000020
config_file=/data/hdd/databases/pgsql/data/postgresql.conf
end_offset=224
end_time=2014-03-27 15:26:35.988701
end_wal=000000010000006400000010
end_xlog=64/100000E0
error=None
hba_file=/data/hdd/databases/pgsql/data/pg_hba.conf
ident_file=/data/hdd/databases/pgsql/data/pg_ident.conf
mode=default
pgdata=/data/hdd/databases/pgsql/data
server_name=main
size=0
status=DONE
tablespaces=None
timeline=1
version=90206
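
A file like the one above can be inspected with a few lines of Python. This is only a sketch: `parse_backup_info` and `backup_warnings` are hypothetical helpers, not Barman functions, and treating `size=0` as a warning sign is an assumption made for illustration rather than documented Barman behaviour.

```python
def parse_backup_info(text):
    """Parse key=value lines into a dict, keeping all values as strings."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        info[key] = value
    return info


def backup_warnings(info):
    """Return a list of things worth a second look in a parsed backup.info."""
    warnings = []
    if info.get("status") != "DONE":
        warnings.append("status is %s, not DONE" % info.get("status"))
    if info.get("error") not in (None, "None"):
        warnings.append("error is set: %s" % info.get("error"))
    if info.get("size") == "0":
        # Assumption: a zero size may mean that no data files were copied.
        warnings.append("size is 0; check that pgdata was actually copied")
    return warnings


# A few of the fields quoted above.
SAMPLE = "error=None\nsize=0\nstatus=DONE\n"
print(backup_warnings(parse_backup_info(SAMPLE)))
```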

Which permissions should be set on the restore directory? I tried both postgres and barman as owner.

Thanks for any suggestions!
Kind regards,
Gaby

Giulio Calacoci

Mar 28, 2014, 8:33:13 AM
to pgba...@googlegroups.com
Hi Gaby,

A command for backup validation is on our TODO list, and we are planning to add it in the future.
Regarding your guess about the WAL size of zero, I suggest that you:

configure barman cron to run in the background at least once a minute

This will force Barman to archive the WAL files. Then run:

$ barman show-backup <server_name> <backup_name>

and check the Base backup information section: the WAL number should be > 0.
You can also force Postgres to archive the current WAL by running "SELECT pg_switch_xlog()" via psql.

Regarding your previous email about "ERROR: data transfer failure": this error means that the rsync command failed.
The barman user must be able to write and create directories in your recovery destination directory using rsync. At the moment I can't say
whether your error is caused by missing user permissions or by the rsync bug on SUSE.

To troubleshoot the issue, I suggest setting "log_level = DEBUG" in your configuration file and running the recovery again. You will then find
additional information about what is failing by searching for the rsync command in the log file.
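
Searching the log for the rsync command can be done with a small filter. The sketch below assumes the log line layout and the logger name `barman.command_wrappers` that appear in the debug output quoted later in this thread; the function name `command_wrapper_entries` is hypothetical.

```python
import re

# "<date> <time>,<ms> <logger> <LEVEL>: <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<logger>\S+) (?P<level>[A-Z]+): (?P<msg>.*)$"
)


def command_wrapper_entries(lines):
    """Yield (timestamp, message) for DEBUG lines from barman.command_wrappers."""
    for line in lines:
        match = LINE_RE.match(line.rstrip("\n"))
        if not match:
            continue  # continuation lines such as tracebacks
        if (match.group("logger") == "barman.command_wrappers"
                and match.group("level") == "DEBUG"):
            yield match.group("ts"), match.group("msg")


sample = [
    "2014-04-01 09:20:32,771 barman.command_wrappers DEBUG: Command return code: 1",
    "2014-04-01 09:20:32,772 barman.backup ERROR: ERROR: data transfer failure",
    "None",
]
for ts, msg in command_wrapper_entries(sample):
    print(ts, msg)
```

To use it on a real log, pass an open file object instead of the `sample` list.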

Regards
Giulio

gap

Mar 28, 2014, 8:44:52 AM
to pgba...@googlegroups.com
Hi,

thanks.
I granted all permissions, so that shouldn't be the problem.

But I found this:

rsync: change_dir \"/data/hdd/databases/backup/barman/main/base/20140327T152629/pgdata\" failed: No such file or directory (2)\n", 4096) = 125
poll([{fd=8, events=POLLIN|POLLPRI}, {fd=6, events=POLLIN|POLLPRI}], 2, 4294967295) = 1 ([{fd=8, revents=POLLIN}])
read(8, "rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1165) [sender=3.1.0]\n", 4096) = 114

I was surprised to find that the only file in the base backup folder was backup.info, with no file system data.
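
This condition can be checked before attempting a restore. A minimal sketch, assuming the layout visible in the rsync error above, where the copied data lives in a `pgdata` subdirectory next to backup.info; `base_backup_has_data` is a hypothetical helper name.

```python
import os


def base_backup_has_data(backup_dir):
    """Return True if backup_dir contains a non-empty 'pgdata' subdirectory."""
    pgdata = os.path.join(backup_dir, "pgdata")
    return os.path.isdir(pgdata) and len(os.listdir(pgdata)) > 0


# Backup directory taken from the rsync error message above.
print(base_backup_has_data(
    "/data/hdd/databases/backup/barman/main/base/20140327T152629"))
```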

So it seems the rsync bug is causing the error. How can I solve this? With another version of rsync?

Thanks and regards,
Gaby

Giulio Calacoci

Mar 28, 2014, 2:12:54 PM
to pgba...@googlegroups.com
Hi Gaby,

Yes, that is the problem I was talking about.

The problem is that SUSE's rsync is compiled with SLP support using an unofficial patch.
Barman builds rsync paths using ':'

example: rsync -e 'ssh user@host' :/path1 :/path2 destination/

Normally the above command means: copy /path1 and /path2 from the host specified with the -e option to destination/

The unofficial patch breaks this behavior, so rsync tries a lookup on the SLP service instead of reading /path2 from the same host.
(The faulty code also exits without any error message, which is why everything seems OK.)
This behavior is specific to SUSE.
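
The command line in the example can be assembled as follows. This is an illustrative sketch of the ':' path convention only, not Barman's actual code; `build_rsync_args` is a hypothetical helper.

```python
def build_rsync_args(ssh_command, remote_paths, destination):
    """Build an rsync argument list that pulls several paths from one host.

    Every remote path gets a leading ':' and carries no host name, so the
    host is the one named in the command given to -e.
    """
    args = ["rsync", "-e", ssh_command]
    args.extend(":" + path for path in remote_paths)
    args.append(destination)
    return args


print(build_rsync_args("ssh user@host", ["/path1", "/path2"], "destination/"))
```

Passing the result as a list to `subprocess` avoids shell quoting issues with the `-e` value, which contains spaces.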

I think you can try compiling rsync from source, without any SLP support or patch. This should solve your problem.

I'll open a bug report about this with the openSUSE developers.

Thank you for your help in discovering this bug.

Regards
Giulio

gap

Mar 31, 2014, 8:34:05 AM
to pgba...@googlegroups.com
Hi Giulio,

thanks for your help and the explanation.
With some help from a colleague, I rebuilt the rsync package without SLP support.

Unfortunately, it's still not working. But now the error already arises when I try to back up my main server, and Barman tells me that it didn't work (which is better than before):

barman.log:
2014-03-31 13:56:09,104 barman.backup INFO: Backup start at xlog location: 64/1B000020 (00000001000000640000001B, 00000020)
2014-03-31 13:56:09,104 barman.backup INFO: Copying files.
2014-03-31 13:56:09,106 barman.backup ERROR: ERROR: data transfer failure
None
2014-03-31 13:56:10,125 barman.backup ERROR: Backup failed copying files

Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/barman/backup.py", line 230, in backup
    backup_size = self.backup_copy(backup_info)
  File "/usr/lib/python2.7/site-packages/barman/backup.py", line 654, in backup_copy
    raise Exception(msg)

Exception: ERROR: data transfer failure

Do you have any further suggestions for this SUSE-related problem?

Thanks,
Gaby

Giulio Calacoci

Mar 31, 2014, 3:30:34 PM
to pgba...@googlegroups.com
Hi,

Sorry for the delay in answering.
I'm happy to see that at least something has changed, and that we now have an error.

I'm currently working on error reporting and exception management for Barman 1.3.1.
Until the new version is released (April 14), could you please activate debug mode
in the configuration file using "log_level = DEBUG" and perform a backup? The log will then contain additional information
about the failing rsync command.

Regards
Giulio

gap

Apr 1, 2014, 3:27:54 AM
to pgba...@googlegroups.com
Hi,

this is the result when setting log_level = DEBUG:

2014-04-01 09:20:32,769 barman.command_wrappers DEBUG: Command: ['rsync', '-e', "ssh 'postgres@localhost' '-o' 'BatchMode=yes' '-o' 'StrictHostKeyChecking=no'", '-rLKpts', '--delete-excluded', '--inplace', '--exclude=/pg_xlog/*', '--exclude=/pg_log/*', '--exclude=/postmaster.pid', ':/data/hdd/databases/pgsql/data/', '/data/hdd/databases/backup/barman/main/base/20140401T092032/pgdata']
2014-04-01 09:20:32,771 barman.command_wrappers DEBUG: Command return code: 1
2014-04-01 09:20:32,771 barman.command_wrappers DEBUG: Command stdout: No SLP support, cannot browse

2014-04-01 09:20:32,772 barman.command_wrappers DEBUG: Command stderr: rsync error: syntax or usage error (code 1) at main.c(1257) [Receiver=3.1.0]

2014-04-01 09:20:32,772 barman.backup ERROR: ERROR: data transfer failure
None
2014-04-01 09:20:33,999 barman.backup ERROR: Backup failed copying files

Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/barman/backup.py", line 230, in backup
    backup_size = self.backup_copy(backup_info)
  File "/usr/lib/python2.7/site-packages/barman/backup.py", line 654, in backup_copy
    raise Exception(msg)
Exception: ERROR: data transfer failure

Regards,
Gaby

Giulio Calacoci

Apr 1, 2014, 7:13:16 AM
to pgba...@googlegroups.com
Hi,

I will investigate this.

So far I can see this:


"2014-04-01 09:20:32,771 barman.command_wrappers DEBUG: Command stdout: No SLP support, cannot browse"

SLP support is still there.

Did you build your package from the official sources from the rsync site?
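
The two signals in the debug output (the return code and the SLP message on stdout) can be turned into a quick diagnosis. A sketch only: `diagnose` is a hypothetical helper, and the code descriptions are copied from the rsync messages quoted in this thread rather than from a complete table.

```python
# Descriptions as printed by rsync in the output quoted in this thread.
RSYNC_CODES = {
    1: "syntax or usage error",
    23: "some files/attrs were not transferred",
}
SLP_MARKER = "No SLP support"


def diagnose(return_code, stdout):
    """Return human-readable hints for one rsync invocation."""
    hints = []
    if return_code != 0:
        meaning = RSYNC_CODES.get(return_code, "unknown error")
        hints.append("rsync exited with code %d (%s)" % (return_code, meaning))
    if SLP_MARKER in stdout:
        # Reading from this thread: the message comes from the SLP patch.
        hints.append("rsync printed an SLP message; the SLP patch is probably still applied")
    return hints


print(diagnose(1, "No SLP support, cannot browse\n"))
```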

regards

Giulio

gap

Apr 1, 2014, 9:26:09 AM
to pgba...@googlegroups.com
Hi Giulio,

so I finally solved the problem by excluding one patch when building the rsync package.
At first I only disabled the SLP option, but it was also necessary to exclude the slp.diff patch.

I will test the same on SLES 11 SP3, because it's our production system. If the same error occurs, I'll open a bug report.

Thanks for your help!

Regards,
gaby

Giulio Calacoci

Apr 1, 2014, 10:37:35 AM
to pgba...@googlegroups.com
Hi Gaby,

I'm happy you solved your problem,
and I have to admit that I was curious to discover why Barman was not working on SUSE.

Now we also know a workaround for this issue.

I was also planning to open a bug report with the SUSE developers once your issue was solved.

Good luck, and keep me posted on your production server installation.

Regards
Giulio

gap

Apr 3, 2014, 7:51:03 AM
to pgba...@googlegroups.com
Hi,

the same rsync issue occurred on SLES 11 SP3, but rebuilding the rsync package without SLP support solved the problem, and everything is working fine.

Regards,
Gaby

Eurides Baptistella

May 13, 2014, 8:27:21 AM
to pgba...@googlegroups.com

gap

May 16, 2014, 4:41:04 AM
to pgba...@googlegroups.com
Hi,

thanks for your hint. I tried your solution first when I found the thread, but it didn't work for me, so I had to rebuild the packages.

Regards,
gaby