[Bug 8177] Problems with big sparsed files

samba...@samba.org

unread,

May 28, 2011, 9:43:30 PM5/28/11

to

https://bugzilla.samba.org/show_bug.cgi?id=8177

--- Comment #1 from jolui...@gmail.com 2011-05-29 01:43:29 UTC ---
More information:
I did compile the rsync 3.0.8 version, statically, in centos 3.9 32 bits,
because this is the platform to build utilities for esxi 4.1.

--
Configure bugmail: https://bugzilla.samba.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the QA contact for the bug.
--
Please use reply-all for most replies to avoid omitting the mailing list.
To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync
Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html

samba...@samba.org

unread,

May 28, 2011, 11:18:23 PM5/28/11

to

https://bugzilla.samba.org/show_bug.cgi?id=8177

--- Comment #2 from jolui...@gmail.com 2011-05-29 03:18:22 UTC ---
I did download the git'ed version (3.1.0), and the problem continue.

samba...@samba.org

unread,

May 29, 2011, 12:46:15 AM5/29/11

to

https://bugzilla.samba.org/show_bug.cgi?id=8177

--- Comment #3 from jolui...@gmail.com 2011-05-29 04:46:15 UTC ---
The size of file is 8gb (not 36gb).
Without parameters --inplace and --sparse, I did check, in the destination
disk, the temporary file (.cata-920-flat.vmdk.WKiDRx), increase up to 8gb, and,
then, I see the rsync error in screen, and the temporary file is deleted.

samba...@samba.org

unread,

May 30, 2011, 3:12:21 PM5/30/11

to

https://bugzilla.samba.org/show_bug.cgi?id=8177

--- Comment #4 from jolui...@gmail.com 2011-05-30 19:12:21 UTC ---
Hi, forum!
Anyone out there?

samba...@samba.org

unread,

Jun 18, 2011, 1:41:13 PM6/18/11

to

https://bugzilla.samba.org/show_bug.cgi?id=8177

Wayne Davison <way...@samba.org> changed:

What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |ASSIGNED

--- Comment #5 from Wayne Davison <way...@samba.org> 2011-06-18 17:41:13 UTC ---
Any time there is a read error, rsync tweaks the checksum so that it doesn't
match, thus alerting the receiver that the file is bogus. You need to figure
out what is causing this error from your OS:

rsync: read errors mapping
"/vmfs/volumes/cata-021-lun0/cata-920/cata-920-flat.vmdk": Invalid argument
(22)

That error is reported at the end of all the reading, and indicates that one of
the read() calls for the sender's file returned that errno. If you strace the
sending side, you should be able to see what the parameters are to the read()
that trigger that OS error.

samba...@samba.org

unread,

Jul 7, 2011, 11:38:30 AM7/7/11

to

https://bugzilla.samba.org/show_bug.cgi?id=8177

--- Comment #6 from jolui...@gmail.com 2011-07-07 15:38:30 UTC ---
Excuse me by the delay, and thanks for your reply.
The problem are:
* I need use the rsync (statically compiled) to rsync files between a esxi-free
4.1 (vmware) and a linux filesystem.
* Then, I did compile the rsync from sources, in a rhel platform (appropiate to
generate binaries for this version of vmware), and transfer it to esxi host.
* In esxi-free 4.1 (the sender side), we don't have tools to debug.
* I don't have experience with sparse files in other platforms, to test it.

Can you give me a little help to modify the source to add debug information in
the syscall?. I did try it, but, because the size of file is very large, and I
see a lot of debug information.

(In reply to comment #5)

samba...@samba.org

unread,

Jan 18, 2012, 5:51:41 AM1/18/12

to

https://bugzilla.samba.org/show_bug.cgi?id=8177

--- Comment #7 from jolui...@gmail.com 2012-01-18 10:51:39 UTC ---
Hi!
Any news with this "bug"?
Can I help with any thing?

TIA
jorge infante

samba...@samba.org

unread,

Mar 1, 2012, 5:08:29 AM3/1/12

to

https://bugzilla.samba.org/show_bug.cgi?id=8177

--- Comment #8 from Rick <ri...@ict-concept.nl> 2012-03-01 10:08:28 UTC ---
Hi Jorge,
You're not alone :)

I'm using rsync the backup some vmware servers. Small files work without any
problems but the large (-flat.vmdk) file only works the first time. (I do
create a snapshot before syncing)

This is the log:
--------------------------
Sync the -flat.vmdk file (Vcenter Server-flat.vmdk)
rsync: read errors mapping
"/vmfs/volumes/4cf36d72-d19e4c8a-cec8-d485645b13ad/Vcenter Server/Vcenter
Server-flat.vmdk": Invalid argument (22)
ERROR: Vcenter Server-flat.vmdk failed verification -- update discarded.
rsync: read errors mapping
"/vmfs/volumes/4cf36d72-d19e4c8a-cec8-d485645b13ad/Vcenter Server/Vcenter
Server-flat.vmdk": Invalid argument (22)
rsync error: some files/attrs were not transferred (see previous errors) (code
23) at main.c(1518) [generator=3.0.9]
Thu Mar 1 10:43:00 CET 2012 FAILED: rsync on Vcenter Server failed (vmdk-flat
file) (0)
--------------------------

Some details:
The both systems are running VMware ESX 4.1.0 build-260247 (Red Hat 5.1).
The filesystem is ext3.
I tried both rsync 3.0.7 and 3.0.9.
The filesize is 80G / 200G (both ESX servers make copies of the other ESX
server. Both ESX servers give the same error)
Removing the --sparse parameter doesn't work

The rsync command used is: /usr/local/bin/rsync -t -e "ssh -i
/operator/vmbackup/vm_image_key -l root" --rsync-path=/operator/rsync --sparse
--stats
"192.168.1.10:/vmfs/volumes/4cf36d72-d19e4c8a-cec8-d485645b13ad/Server02/Server02-flat.vmdk"
"/vmfs/volumes/VM-Datastore-VMHOST02//Server02/"

The backup script runs at multiple customers (some are also ESX 4.1 machines),
but this is the only one giving an error.

samba...@samba.org

unread,

Mar 10, 2012, 3:36:01 PM3/10/12

to

https://bugzilla.samba.org/show_bug.cgi?id=8177

--- Comment #9 from Remco Hosman <re...@hosman.xs4all.nl> 2012-03-10 20:36:00 UTC ---
Created attachment 7373
--> https://bugzilla.samba.org/attachment.cgi?id=7373
strace dumps

samba...@samba.org

unread,

Mar 10, 2012, 3:36:30 PM3/10/12

to

https://bugzilla.samba.org/show_bug.cgi?id=8177

--- Comment #10 from Remco Hosman <re...@hosman.xs4all.nl> 2012-03-10 20:36:28 UTC ---
Created attachment 7374
--> https://bugzilla.samba.org/attachment.cgi?id=7374

samba...@samba.org

unread,

Mar 10, 2012, 3:37:35 PM3/10/12

to

https://bugzilla.samba.org/show_bug.cgi?id=8177

--- Comment #11 from Remco Hosman <re...@hosman.xs4all.nl> 2012-03-10 20:37:32 UTC ---
I am having the same issue with ESXi5.0 and rsync 3.0.9. Compiled rsync on a
debian 4.0 system.
I have a VM and make a snapshot. at that point the original disk file should
not change anymore.
When the problem does not occur:
*) when the VM is not running
*) when the image file does not exists on the target

when the problem does occur:
*) when the VM is running and the file already exists on the target.

I attached a tgz with straces of the following command:
strace -v -tt -f -o out.log ./rsync --copy-links --recursive --compress
--progress --inplace --bwlimit=100
/vmfs/volumes/4e9007cc-5eb44d46-eace-0019b9e8ff4d/backups/
user@host:/targetdir/
in all 4 cases (running/not running and existing/non existing target file)

but it comes down to this:
--- VM off, file already exists on target machine ---
7754511 20:20:40.298664 open("backuptest/backuptest_7-flat.vmdk",
O_RDONLY|O_LARGEFILE) = 3
7754511 20:20:40.300134 fstat64(0x3, 0xfff6cacc) = 0
7754511 20:20:40.300296 write(1, "backuptest/backuptest_7-flat.vmd"..., 34) =
34
7754511 20:20:40.317672 mmap2(NULL, 266240, PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x782f0000
7754511 20:20:40.317831 read(3,
"\25\0\2161\246\210wN\253Y\332O`T\357\354\21\307\344\31"..., 262144) = 262144
7754511 20:20:40.341102 gettimeofday({1331410840, 341167}, NULL) = 0
7754511 20:20:40.341230 ioctl(1, TIOCGPGRP, [7748366]) = 0
7754511 20:20:40.341434 write(1, " 4096 0% 13.38kB/s "..., 41) =
41
-- bunch of gettimeofday()'s --
7754511 20:20:40.382444 read(3,
"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 262143) = 262143

--- VM on, file already exists on target machine ---
7672734 20:24:08.828084 open("backuptest/backuptest_7-flat.vmdk",
O_RDONLY|O_LARGEFILE) = 3
7672734 20:24:08.828808 fstat64(0x3, 0xfff6cacc) = 0
7672734 20:24:08.828976 write(1, "backuptest/backuptest_7-flat.vmd"..., 34) =
34
7672734 20:24:08.829380 mmap2(NULL, 266240, PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x782f0000
7672734 20:24:08.829529 read(3,
"\25\0\2161\246\210wN\253Y\332O`T\357\354\21\307\344\31"..., 262144) = 262144
7672734 20:24:08.831043 gettimeofday({1331411048, 831103}, NULL) = 0
7672734 20:24:08.831165 ioctl(1, TIOCGPGRP, [7748509]) = 0
7672734 20:24:08.831367 write(1, " 4096 0% 15.09kB/s "..., 41) =
41
-- bunch of gettimeofdays()'s --
7672734 20:24:08.847043 read(3, 0x782f0009, 262143) = -1 EINVAL (Invalid
argument)

I am kinda lost at this point.

Remco

samba...@samba.org

unread,

Mar 10, 2012, 5:47:51 PM3/10/12

to

https://bugzilla.samba.org/show_bug.cgi?id=8177

--- Comment #12 from grar...@gmail.com 2012-03-10 22:47:50 UTC ---
As an aside, be aware that you may encounter rather poor
performance with sparse files. If so, and until this bug
is addressed, you may need to look at other transfer options.
https://bugzilla.samba.org/show_bug.cgi?id=7854

You may also want to backup a snapshot (maybe one application
level write point of fail) instead of missing all the writes
numbered 2+ that will slip by while waiting on those hundred
GB to copy. More properly, shut the vm down first or use any
built in snapshot tool the vm has.

Also, check *.* syslog and dmesg for any errors, run memtest86.
But it say mmap 22, so maybe not disk media error.

samba...@samba.org

unread,

Mar 11, 2012, 4:08:24 AM3/11/12

to

https://bugzilla.samba.org/show_bug.cgi?id=8177

--- Comment #13 from Remco Hosman <re...@hosman.xs4all.nl> 2012-03-11 08:08:22 UTC ---
The reason for me to use rsync is the following:
I want to backup a remote server over the internet. so bandwidth usage is an
issue. The backup target is a btrfs system (again, not the cause of the
problem, happens with ext3 on the target too). I make a reflink copy of
yesterdays file and do rsync --inplace to backup. uses very little bandwidth
and diskspace. The time it takes for the rsync to take place is not an issue.
The snapshot is made and does not change over time until delete it after rsync
is done.

i did some more testing and found out the problem is not related to the file
being sparse or big. when i have a non-sparse 16MB file, the problem persists.

now im not an expert in interpreting strace output, but i dont see much
difference. but does strace display things differently depending on the return
value of a call?

3978200 20:24:54.964491 read(3,
"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 262144) = 262144

7672734 20:24:08.847043 read(3, 0x782f0009, 262143) = -1 EINVAL (Invalid
argument)

what is that 0x782f0009 ? the address rsync wants read to put the data?

Remco

samba...@samba.org

unread,

Mar 11, 2012, 6:38:03 AM3/11/12

to

https://bugzilla.samba.org/show_bug.cgi?id=8177

--- Comment #14 from Remco Hosman <re...@hosman.xs4all.nl> 2012-03-11 10:38:02 UTC ---
Is it possible the issue in the non-blocksized-read ?
the first read() is always 262144 bytes.
when the target does not exist yet, all other read()'s are 262144 bytes too,
however, when the target file does exist, the next read()'s are 1 byte less:
262143 bytes.

Why this only matters when the VM is running i do not know. Must be an internal
VMFS issue.

samba...@samba.org

unread,

Mar 11, 2012, 6:52:56 AM3/11/12

to

https://bugzilla.samba.org/show_bug.cgi?id=8177

--- Comment #15 from Remco Hosman <re...@hosman.xs4all.nl> 2012-03-11 10:52:55 UTC ---
confirmed. the following text program demonstrates the problem:
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

int main(void) {
int fd, size;
char data[300000];

fd =
open("/vmfs/volumes/4e9007cc-5eb44d46-eace-0019b9e8ff4d/backups/backuptest/backuptest_7-flat.vmdk",O_RDONLY);
printf("open() : %d\n",fd);

size = read(fd, data, 262144);
printf("read() : %d\n",size);

size = read(fd, data, 262143);
printf("read() : %d\n",size);

close(fd);
}

the first read() returns 262144, as it should. the 2nd returns -1 on VMFS when
the VM is running.
The required block-alignment is a multiple of 512 bytes.

samba...@samba.org

unread,

Mar 11, 2012, 4:11:47 PM3/11/12

to

https://bugzilla.samba.org/show_bug.cgi?id=8177

--- Comment #16 from Remco Hosman <re...@hosman.xs4all.nl> 2012-03-11 20:11:46 UTC ---
Someone pointed out i forgot to test the error returned by the failed reads. so
here they are.

after a failed read errno is 22, meaning EINVAL (also the same numerical error
rsync reports).

I also found that lseek()'ing to a non-512 byte boundary makes read()'s of any
size fail until you seek to a 512-byte boundary again.

samba...@samba.org

unread,

Oct 19, 2012, 8:49:15 AM10/19/12

to

https://bugzilla.samba.org/show_bug.cgi?id=8177

--- Comment #17 from molniev <mol...@hotmail.com> 2012-10-19 12:49:07 UTC ---
Hi.
I'am using rsync 3.0.9 on ESXi 5.0 and detect this bug (thanks Remco Hosman).
I made a patch that solves the problem. This code reads the files, aligned
with the 4096 border. The size of the blocks in (4096 * 4096 * 2) bytes (I
think it will be faster, but you sacrifice memory).

Replace the file "fileio.c" function "map_ptr" (line 180-264) with the
following code:

char *map_ptr(struct map_struct *map, OFF_T offset, int32 len)
{
OFF_T ret = 0;
int32 read_size = 0;
int32 n = 0;

if (0 == len) return NULL;
if (len < 0) { rprintf(FERROR, "invalid len passed to map_ptr: %ld\n",
(long)len); exit_cleanup(RERR_FILEIO); }

if ( (offset >= map->p_offset) && (offset + len <= map->p_offset +
map->p_len) )
return map->p + (offset - map->p_offset);

map->p_offset = (offset >> 12);
map->p_offset = (map->p_offset << 12);
while (map->p_len < (len + 4096) )
map->p_len += (4096 * 4096 * 2);

if (map->p_len > map->p_size) {
map->p_size = map->p_len;
map->p = realloc_array(map->p, char, map->p_size);
if (!map->p) out_of_memory("map_ptr");
}

if (map->p_fd_offset != map->p_offset) {
ret = do_lseek(map->fd, map->p_offset, SEEK_SET);
if (ret != map->p_offset) {
rsyserr(FERROR, errno, "lseek returned %.0f, not %.0f",
(double)ret, (double)map->p_offset);
exit_cleanup(RERR_FILEIO);
}
map->p_fd_offset = map->p_offset;
}

if (map->p_offset + map->p_len > map->file_size) {
if (map->p_offset > map->file_size) {
map->status = ENODATA;
printf("DBG path function map_ptr error: %i\n", (int)__LINE__);
memset(map->p + (offset - map->p_offset), 0, len);
return (map->p + (offset - map->p_offset) );
}
map->p_len = map->file_size - map->p_offset;
if (len > (map->p_len - (offset - map->p_offset) ) ) {
map->status = ENODATA;
printf("DBG path function map_ptr error: %i\n", (int)__LINE__);
memset(map->p + map->p_len, 0, map->p_len - len);
}
}

read_size = map->p_len;
while (read_size > 0) {
n = read(map->fd, map->p, read_size);
if (n <= 0) {
if (!map->status) map->status = n ? errno : ENODATA;
printf("DBG path function map_ptr error: %i\n", (int)__LINE__);
memset(map->p, 0, map->p_len);
break;
}
map->p_fd_offset += n;
read_size -= n;
}

return map->p + (map->p_offset - offset);
}

If you find a bug in my code - please let - I'll try to fix it.

samba...@samba.org

unread,

Oct 19, 2012, 8:57:55 AM10/19/12

to

https://bugzilla.samba.org/show_bug.cgi?id=8177

--- Comment #18 from molniev <mol...@hotmail.com> 2012-10-19 12:57:52 UTC ---
Created attachment 8090
--> https://bugzilla.samba.org/attachment.cgi?id=8090
Correct fileio.c

samba...@samba.org

unread,

Jan 2, 2013, 12:49:40 PM1/2/13

to

https://bugzilla.samba.org/show_bug.cgi?id=8177

--- Comment #19 from ciscolyon <sup...@fidit.com> 2013-01-02 17:49:39 UTC ---
(In reply to comment #18)
> Created attachment 8090 [details]
> Correct fileio.c

where can I get a Rsync binary for esxi 5.1

Thank you

samba...@samba.org

unread,

Jan 3, 2013, 1:54:23 PM1/3/13

to

https://bugzilla.samba.org/show_bug.cgi?id=8177

--- Comment #20 from molniev <mol...@hotmail.com> 2013-01-03 18:54:22 UTC ---
(In reply to comment #19)

> (In reply to comment #18)

> > Created attachment 8090 [details] [details]

> > Correct fileio.c
>
> where can I get a Rsync binary for esxi 5.1
>
> Thank you

Use article:
http://www.virtuallyghetto.com/2011/02/how-to-compile-statically-linked-rsync.html

For 5.1 compatible not only Centos 3.9. Use newest distributive (4.x or 5.x).

samba...@samba.org

unread,

Jan 16, 2013, 1:05:05 PM1/16/13

to

https://bugzilla.samba.org/show_bug.cgi?id=8177

--- Comment #21 from ciscolyon <sup...@fidit.com> 2013-01-16 18:04:57 UTC ---

I find the binay on

http://www.virtuallyghetto.com/2011/02/how-to-compile-statically-linked-rsync.html
and it works

samba...@samba.org

unread,

Jan 18, 2013, 6:38:59 PM1/18/13

to

https://bugzilla.samba.org/show_bug.cgi?id=8177

Wayne Davison <way...@samba.org> changed:

What |Removed |Added
----------------------------------------------------------------------------

Status|ASSIGNED |RESOLVED
Resolution| |FIXED

--- Comment #22 from Wayne Davison <way...@samba.org> 2013-01-18 23:38:52 UTC ---
I have checked in a change that makes the map_ptr() function align its reads on
1024-byte boundaries. Hopefully that will fix this issue without making things
inefficient.

I looked at the patch from molniev, and it has a bug in the read look should
the read return less than the number of bytes requested (the next read would
overwrite earlier bytes instead of appending them). My patch is instead just a
simpler bit of rounding changes made to the stock code.