irsync failure


John Knutson

Jun 15, 2010, 2:52:02 PM
to irod...@googlegroups.com
I've been doing some testing with server federation and have run into a
problem. I have one iRODS server where an app had been writing data
into the vault using the rcDataObjWrite API, and on the second server, I
had been periodically (once every 5 minutes) executing irsync to mirror
the collection into which that data had been written.

I now get consistent errors that I am unable to explain. If I use my
original command:
irsync -rvV -R resc1 i:/bZone/archive/raw i:/aZone/archive/adb

the output is:
ERROR: rsyncCollToCollUtil: rsyncDataToDataUtil failed for
/aZone/archive/adb/1016508s.mdp.stat=-27150 status = -27150
SYS_COPY_LEN_ERR, Operation now in progress
ERROR: readMsgHeader:header read- read 0 bytes, expect 4, status = -4002
ERROR: readAndProcApiReply: readMsgHeader error. status = -4002 status =
-4002 SYS_HEADER_READ_LEN_ERR, No such file or directory
ERROR: rsyncCollToCollUtil: rsyncDataToDataUtil failed for
/aZone/archive/adb/1016608s.mdp.stat=-4002 status = -4002
SYS_HEADER_READ_LEN_ERR, No such file or directory
ERROR: rsyncUtil: rsync error for /aZone/archive/adb status = -4002
SYS_HEADER_READ_LEN_ERR, No such file or directory
Client Caught broken pipe signal. Connection to server may be down
NOTICE: writeMsgHeader: wrote 0 bytes, expect 140, status = -5032


If I try to copy the one file specifically, using:
irsync -vV -R resc1 i:/bZone/archive/raw/1016508s.mdp i:/aZone/archive/adb/1016508s.mdp

the output is just:
ERROR: rsyncUtil: rsync error for /aZone/archive/adb/1016508s.mdp status
= -4150 SYS_HEADER_READ_LEN_ERR, Operation now in progress


I thought there might be corruption on zone "b" but the output of ils at
least shows the same size of file as what is actually on the file
system. I also looked at the permissions on zone "a", and nothing
looked out of the ordinary in either the file system or the zone.

However, when I looked at the file on zone "a", iRODS is reporting a
size of 33187760 bytes, while the file system is only showing 22859498
bytes. It would appear that an irsync was interrupted and has been
failing ever since. That said, what is the proper way to resolve this?
I'm hoping there's a better way than simply deleting the data object and
redoing the sync.


mw...@diceresearch.org

Jun 15, 2010, 3:59:58 PM
to irod...@googlegroups.com
Hello John,

>ERROR: rsyncUtil: rsync error for /aZone/archive/adb/1016508s.mdp status
>= -4150 SYS_HEADER_READ_LEN_ERR, Operation now in progress

The server may have segfaulted in this case. I think we have fixed it for
the next release. My test system showed:

>$ irsync -v i:xy2 i:xy
>ERROR: rsyncUtil: rsync error for /oneZone/home/rods/xy status = -27000 SYS_COPY_LEN_ERR

We'll need to add an iCommand similar to the UNIX fsck (ifsck?) that checks for
consistency between the registered file size and the actual file size and, as an option, fixes it.

But in the meantime, you may want to mv the physical file aside and irm the data object.
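
Until such a command exists, the check described above (registered size versus actual size on disk) can be approximated by hand. The following is a minimal Python sketch, not an iRODS tool: the function name `find_size_mismatches` and its input format are hypothetical, and it assumes you have already collected the registered sizes and physical vault paths yourself (for example from ils output). It only reports differences and fixes nothing.

```python
import os


def find_size_mismatches(registered_sizes):
    """Compare catalog-registered sizes against sizes on the file system.

    registered_sizes: dict mapping a physical file path to the size in
    bytes that the catalog reports for it. (Hypothetical input format;
    gathering these values from iRODS is outside this sketch.)

    Returns a list of (path, registered_size, actual_size) tuples for
    every entry that disagrees. actual_size is None when the file is
    missing or cannot be examined.
    """
    mismatches = []
    for path, registered in registered_sizes.items():
        try:
            actual = os.path.getsize(path)
        except OSError:
            actual = None  # file missing or unreadable
        if actual != registered:
            mismatches.append((path, registered, actual))
    return mismatches
```

For the file in this thread, the input would hold the registered size 33187760 for the physical path of 1016508s.mdp in zone "a", and the function would report it alongside the 22859498 bytes actually on disk.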

Mike



--
"iRODS: the Integrated Rule-Oriented Data-management System; A community driven, open source, data grid software solution" https://www.irods.org

iROD-Chat: http://groups.google.com/group/iROD-Chat

John Knutson

Jun 15, 2010, 6:17:22 PM
to irod...@googlegroups.com
That's unfortunate, but thanks for the answer.

Another (unrelated) question: is there any way to change the umask of
files and directories created with the unix file system driver such that
local access is possible? Right now all files in the vault are 600, and
directories 700 (well, plus what appears to be something like a sticky
bit; the GNU ls man page doesn't go into that). I'd like for local
users to be able to bypass going through the server to get the data,
instead just using the iRODS server to locate the file(s) on the local
file system containing the desired data. The default permissions make
that impossible :-).

mw...@diceresearch.org

Jun 15, 2010, 7:14:18 PM
to irod...@googlegroups.com
John,

In the file scripts/perl/irodsctl.pl, you can change the default modes by
uncommenting and setting these parameters:

# $DefFileMode - the mode of the file created in the resource vault.
# The default value is 0600 (DEFAULT_FILE_MODE).
# $DefFileMode=0640;

# $DefDirMode - the mode of the directory created in the resource vault.
# The default value is 0750 (DEFAULT_DIR_MODE).
# $DefDirMode=0700;

John Knutson

Jun 16, 2010, 10:23:46 AM
to irod...@googlegroups.com
Looks like it worked for files, but directories are still being created
with the following perms:
drwx--S--- 2 johnk arlut 512 Jun 16 14:09 test/

Which, according to the solaris man page, means:
"Undefined bit-state (the set-user-ID or set-group-id bit is on and the
user or group execution bit is off). For group permissions, this
applies only to non-regular files."

I set $DefFileMode=0644 and $DefDirMode=0755... and poked around in the
code a bit and didn't see anything obvious...


John Knutson

Jun 16, 2010, 11:51:15 AM
to irod...@googlegroups.com
OK, in that same script file, irodsctl.pl, there's a line just before
"start the server" that does umask(077), which ultimately masks any
non-user mode bits defined in the other two variables. I commented out
that umask call and that did the trick.
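
The masking effect of umask(077) can be shown with the standard POSIX creation rule, where the mode actually applied is the requested mode with the umask bits cleared. This is a small Python sketch of that arithmetic only; the helper name `effective_mode` is made up for illustration, it does not model the set-group-ID bit seen in the directory listing, and it makes no claim about any extra chmod the server may perform.

```python
def effective_mode(requested_mode, umask):
    """Return the permission bits left after the process umask is applied."""
    return requested_mode & ~umask & 0o7777


if __name__ == "__main__":
    # The values set in this thread, under the umask(077) from irodsctl.pl.
    for name, mode in (("DefFileMode", 0o644), ("DefDirMode", 0o755)):
        print(name, oct(mode), "->", oct(effective_mode(mode, 0o077)))
    # prints:
    # DefFileMode 0o644 -> 0o600
    # DefDirMode 0o755 -> 0o700
```

With the umask call removed (a umask of 0), the same function returns the requested modes unchanged.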

mw...@diceresearch.org

Jun 16, 2010, 12:41:19 PM
to irod...@googlegroups.com
Thanks, John. We'll document that.