For some reason my clients have been experiencing connection losses when
transferring files larger than a few hundred MB. Here is the output from
my systems.
From the client:
[edwin@client ~]$ iput -f -V 1Gfile.dat
NOTICE: irodsHost=irodsserver.iplantcollaborative.org
NOTICE: irodsPort=1247
NOTICE: irodsUserName=edwintest
NOTICE: irodsZone=iplant
NOTICE: created irodsHome=/iplant/home/edwintest
NOTICE: created irodsCwd=/iplant/home/edwintest
From server: NumThreads=8, addr:150.135.x.y, port:20019, cookie=1101858949
ERROR: rcPartialDataPut: toWrite 4194304, bytesWritten 645696, errno =
110 status = -27110 SYS_COPY_LEN_ERR, Connection timed out
ERROR: rcPartialDataPut: toWrite 4194304, bytesWritten 798912, errno =
110 status = -27110 SYS_COPY_LEN_ERR, Connection timed out
ERROR: rcPartialDataPut: toWrite 4194304, bytesWritten 388512, errno =
110 status = -27110 SYS_COPY_LEN_ERR, Connection timed out
ERROR: rcPartialDataPut: toWrite 4194304, bytesWritten 388512, errno =
110 status = -27110 SYS_COPY_LEN_ERR, Connection timed out
ERROR: rcPartialDataPut: toWrite 4194304, bytesWritten 1077984, errno =
110 status = -27110 SYS_COPY_LEN_ERR, Connection timed out
ERROR: putUtil: put error for /iplant/home/edwintest/1Gfile.dat, status
= -27110 status = -27110 SYS_COPY_LEN_ERR, Connection timed out
From the iCAT server (150.135.x.z):
Apr 28 22:40:15 pid:2729 NOTICE: Agent process 4769 started for
puser=edwintest and cuser=edwintest from 10.0.0.137
Apr 28 22:40:15 pid:4769 NOTICE: rsAuthCheck user edwintest#iplant
Apr 28 22:40:15 pid:4769 NOTICE: rsAuthResponse set proxy authFlag to 3,
client authFlag to 3, user:edwintest#iplant proxy:edwintest client:edwintest
Apr 28 22:40:15 pid:2729 NOTICE: Agent process 4771 started for
puser=rodsadmin and cuser=edwintest from 150.135.x.y
Apr 28 22:40:15 pid:4771 NOTICE: rsAuthCheck user rodsadmin#iplant
Apr 28 22:40:15 pid:4771 NOTICE: rsAuthResponse set proxy authFlag to 5,
client authFlag to 3, user:rodsadmin#iplant proxy:rodsadmin client:edwintest
Apr 28 22:40:15 pid:4771 NOTICE: rsAuthCheck user rodsadmin#iplant
Apr 28 22:40:15 pid:4771 NOTICE: readAndProcClientMsg: received
disconnect msg from client
Apr 28 22:40:15 pid:4771 NOTICE: Agent exiting with status = 0
From the non-iCAT enabled server (150.135.x.y):
Apr 28 22:37:20 pid:13366 NOTICE: Agent process 13466 started for
puser=rodsadmin and cuser=edwintest from 150.135.x.z
Apr 28 22:37:21 pid:13466 NOTICE: Warning, cannot authenticate remote
server, serverResponse field is empty
Apr 28 22:37:21 pid:13466 NOTICE: rsAuthResponse set proxy authFlag to
5, client authFlag to 3, user:rodsadmin#iplant proxy:rodsadmin
client:edwintest
I noticed that there is no corresponding "Agent exiting" message from the
non-iCAT enabled server. Does this imply that the agent is crashing?
Has anyone encountered similar problems? I have also restarted both
iRODS servers to see if that would help, but it did not. Here are the
specs for my iRODS setup:
iRODS v2.4.1 (both clients and servers)
iCAT server: redhat 5.6
non-iCAT server: Solaris 10 10/09
Interestingly enough, the filesystem shows that the file has the same
number of bytes, but md5sum on the system reports a different
hash. Also, if I run ils, iRODS reports that the file is there
with the same number of bytes, but if I attempt an iget, I get a
-317000 USER_INPUT_PATH_ERR (path does not exist) error. It appears
that iRODS is left in an inconsistent state.
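The size-versus-hash comparison described above can be scripted so that it is repeatable after each failed transfer. This is only a minimal sketch: the function names are made up for illustration, and it assumes both the original file and the copy in the resource vault are readable as ordinary local paths (for example, by running it on each host and comparing the results by hand).

```python
import hashlib


def file_fingerprint(path, chunk_size=4 * 1024 * 1024):
    """Return (size_in_bytes, md5_hex) for a file, reading it in chunks."""
    md5 = hashlib.md5()
    size = 0
    with open(path, "rb") as handle:
        # Read fixed-size chunks so a ~1G file never has to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            md5.update(chunk)
            size += len(chunk)
    return size, md5.hexdigest()


def compare_copies(source_path, stored_path):
    """Report whether two copies agree in size and in md5 hash."""
    src_size, src_md5 = file_fingerprint(source_path)
    dst_size, dst_md5 = file_fingerprint(stored_path)
    return {
        "same_size": src_size == dst_size,
        "same_md5": src_md5 == dst_md5,
    }
```

A result with same_size true and same_md5 false would match the symptom reported here: the file has its full length on disk, but part of its content is not what was sent.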
Any help would be greatly appreciated.
Thank you,
Edwin
I have seen some "connection timed out" issues on a flaky
network, but it looks like you are doing your testing at your local
site, so that should not be the problem. Anyway, you can try the '-T' option.
As for the USER_INPUT_PATH_ERR error, I do not have a clear idea. For some
reason, the file path you are writing to does not seem to be reachable: a
filesystem mount problem, or an ACL setting on the filesystem tree?
cheers,
JY
[edwin@client ~]$ ls -laF 32Mb-file.dat
-rw-rw-r-- 1 edwin edwin 34171904 Apr 29 09:20 32Mb-file.dat
[edwin@client ~]$ iput -V -T -f 32Mb-file.dat
NOTICE: irodsHost=irodsserver.iplantcollaborative.org
NOTICE: irodsPort=1247
NOTICE: irodsUserName=edwintest
NOTICE: irodsZone=iplant
NOTICE: created irodsHome=/iplant/home/edwintest
NOTICE: created irodsCwd=/iplant/home/edwintest
From server: NumThreads=2, addr:150.135.x.y, port:20163, cookie=1064427653
The client/server socket connection has been renewed
The client/server socket connection has been renewed
The client/server socket connection has been renewed
The client/server socket connection has been renewed
The client/server socket connection has been renewed
The client/server socket connection has been renewed
The client/server socket connection has been renewed
The client/server socket connection has been renewed
The client/server socket connection has been renewed
The client/server socket connection has been renewed
The client/server socket connection has been renewed
The client/server socket connection has been renewed
The client/server socket connection has been renewed
The client/server socket connection has been renewed
The client/server socket connection has been renewed
ERROR: rcPartialDataPut: toWrite 4194304, bytesWritten 1658016, errno =
110 status = -27110 SYS_COPY_LEN_ERR, Connection timed out
ERROR: putUtil: put error for /iplant/home/edwintest/32Mb-file.dat,
status = -27110 status = -27110 SYS_COPY_LEN_ERR, Connection timed out
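To see how far each write got before the timeout, the rcPartialDataPut lines can be pulled out of the iput output. Below is a rough sketch, not an iRODS tool: the regular expression only assumes the message layout shown in the output above, and the function name is made up. For reference, toWrite 4194304 is 4 MiB, and errno 110 on Linux is ETIMEDOUT, which is where the "Connection timed out" text comes from.

```python
import re

# Matches the numeric fields of an rcPartialDataPut error as printed by iput above.
PARTIAL_PUT = re.compile(r"rcPartialDataPut: toWrite (\d+), bytesWritten (\d+)")


def short_writes(log_text):
    """Return a list of (to_write, bytes_written, fraction_written) tuples."""
    results = []
    for match in PARTIAL_PUT.finditer(log_text):
        to_write = int(match.group(1))
        written = int(match.group(2))
        results.append((to_write, written, written / to_write))
    return results


# Sample taken from the 32Mb-file.dat transfer above.
sample = """\
ERROR: rcPartialDataPut: toWrite 4194304, bytesWritten 1658016, errno =
110 status = -27110 SYS_COPY_LEN_ERR, Connection timed out
"""

for to_write, written, fraction in short_writes(sample):
    print(f"{written} of {to_write} bytes ({fraction:.1%}) written before the timeout")
```

Running the same function over the output of the 1G transfer would show each of the failing writes stopping partway through its 4 MiB buffer.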
As for the filesystem, I can safely iput a file smaller than 32M. Below
is an example of a transfer of a 3M file:
[edwin@client bin]$ ls -laF Xnest
-rwxr-xr-x 1 root root 3758848 Nov 17 16:26 Xnest*
[edwin@client bin]$ iput -V -T -f Xnest
NOTICE: irodsHost=irodsserver.iplantcollaborative.org
NOTICE: irodsPort=1247
NOTICE: irodsUserName=edwintest
NOTICE: irodsZone=iplant
NOTICE: created irodsHome=/iplant/home/edwintest
NOTICE: created irodsCwd=/iplant/home/edwintest
Xnest 3.585 MB | 0.909 sec | 0 thr | 3.943 MB/s
So, given my understanding of iRODS, this transfer actually goes through
my iCAT server to the non-IES server to perform the operation (in this
case, everything goes to the same non-IES server). So, I would assume this
rules out any ACL permission or mounting issues. Is that a correct
assumption?
What is also disconcerting is that I can scp the same ~1G file without
interruption from the same client to the resource/non-IES server within
2 minutes (7.6M/s). AFAIK, scp is also relatively sensitive to connection
drops... unless iRODS is more sensitive (?).
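As a quick sanity check on those figures, the expected transfer times can be worked out from the reported rates. This is only back-of-the-envelope arithmetic under stated assumptions: "~1G" is taken as 1 GiB, "7.6M/s" and iput's "3.943 MB/s" are both read as MiB per second, and the small-file iput rate is assumed to hold for a large file, which it may well not.

```python
MIB = 1024 * 1024


def transfer_seconds(size_bytes, rate_bytes_per_sec):
    """Time needed to move size_bytes at a constant rate."""
    return size_bytes / rate_bytes_per_sec


file_size = 1024 * MIB    # assumption: "~1G" means 1 GiB
scp_rate = 7.6 * MIB      # assumption: scp's "7.6M/s" is MiB per second
iput_rate = 3.943 * MIB   # rate iput reported for the 3.585 MB Xnest transfer

print(f"scp:  {transfer_seconds(file_size, scp_rate):.0f} s")
print(f"iput: {transfer_seconds(file_size, iput_rate):.0f} s (extrapolated)")
```

The scp estimate comes out a little over two minutes, which is consistent with the observed time, so the network path itself seems able to sustain a transfer of that length.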
Actually, my client is in the same domain but in a different building on
campus.
Thanks,
Edwin
--
Edwin Skidmore
HRH Manager of Systems and Integrated Services
iPlant Collaborative