Experience transferring large files between iRODS zones?

218 views
Skip to first unread message

Robert Verkerk

unread,
Aug 24, 2022, 5:11:30 AM8/24/22
to irod-chat
Hi,

We are trying to transfer a large file (1.2TB) from zone A (Netherlands) to zone B (Spain). Started the transfer on Monday 2022-08-23 at 13:45. It is still busy.
If it stayed performing at 1Gb/s it would have been finished with the transfer within 3 hours.

The command used to transfer the file is a s follows:
$ date ; irsync -V i:test_1.2T.tar.gz i:/bsc/home/rods#vzSARA2/test_1.2T_v2.tar.gz -R compbiomed ; date
Mon Aug 22 13:45:47 CEST 2022

We see that it starts with a transfer speed of about 1Gb/s to zone B. At this time it uses 4 threads to transfer the data. We can see that with a "netstat -an" and grep on the receiving IP address.
After 1 hour and 40 minutes it drops to about 40Mb/s. And then it only uses 2 threads.


Has anybody any experience with the transfer of such big files with iRODS between zones?

We know the mtu size at zone A is 9000. We know the firewall and the network at zone A do NOT CAP on purpose. There is nothing setup on te firewall or the network to implement this.
We know the mtu size at zone B is 1500.

Is there something in iRODS we can set to troubleshoot this? iRODS 4.2.10 on both zones.

Greetings,

Robert Verkerk




John Constable

unread,
Aug 24, 2022, 10:28:46 AM8/24/22
to iRODS-Chat
A couple of thoughts;

Are you hitting IO issues on the sender or receiver perhaps? What does `iotop` say on the appropriate file systems?

Terrell Russell

unread,
Aug 24, 2022, 12:49:48 PM8/24/22
to irod...@googlegroups.com
Very curious.

I notice that 1h40m is 100 minutes (a nice round number).

I also have never seen the number of threads drop ("And then it only uses 2 threads"), but the transfer continue to make progress.

Is there anything of interest in the server logs on either end?  Any errors?

The transfer eventually completes successfully?  Just throttled along the way?

Terrell




--
--
The Integrated Rule-Oriented Data System (iRODS) - https://irods.org
 
iROD-Chat: http://groups.google.com/group/iROD-Chat
---
You received this message because you are subscribed to the Google Groups "iRODS-Chat" group.
To unsubscribe from this group and stop receiving emails from it, send an email to irod-chat+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/irod-chat/aedde199-fa81-4a7c-9572-ded92d09c404n%40googlegroups.com.

Tony Edgin

unread,
Aug 24, 2022, 2:56:07 PM8/24/22
to irod...@googlegroups.com
Hi Robert.

Are you transferring data into an HPC system?  I've seen this problem when doing that. We were using iget in an HPC environment to download a large file, but we weren't on a data transfer node.  The iget call used parallel transfer due to the size of the file. The transfer threads were scheduled to run on different compute nodes, which confused iget. Over time the transfer threads died off, resulting in the throughput decreasing to zero. We switched to running iget on a data transfer node, and this solved our problem. 

Cheers,
Tony

Robert Verkerk

unread,
Aug 25, 2022, 7:07:55 AM8/25/22
to irod-chat
HI,

Thanks for your thoughts and replies.

I will set the timeout parameter on my side. At the moment it is the default of 7200. And ask the other zone to do the same.

We connect to one machine on the remote zone. That is were the remote resource is. It is a nfs connected filesystem on the iCAT server. But for the iCAT server it is just a filesystem. 

At the end we see some errors:

$ date ; irsync -V i:test_1.2T.tar.gz i:/bsc/home/rods#vzSARA2/test_1.2T_v2.tar.gz -R compbiomed ; date
Mon Aug 22 13:45:47 CEST 2022
remote addresses: 145.100.2.197 ERROR: rsyncDataToDataUtil - Failed to copy the object "/vzSARA2/home/rods/test_1.2T.tar.gz" to "/bsc/home/rods#vzSARA2/test_1.2T_v2.tar.gz" SYS_COPY_LEN_ERR
remote addresses: 145.100.2.197 ERROR: rsyncUtil: rsync error for /bsc/home/rods#vzSARA2/test_1.2T_v2.tar.gz status = -27000 SYS_COPY_LEN_ERR
Wed Aug 24 15:20:29 CEST 2022

And the logfile shows:

$ grep -vi -e trimmed -e purge rodsLog.2022.08.22
Aug 24 15:20:29 pid:91752 NOTICE: _partialDataPut:Bytes written 2071868 don't match read 2447360
Aug 24 15:20:29 pid:91752 WARNING: No replica access token in L1 descriptor. Ignoring replica access table. [path=/bsc/home/rods#vzSARA2/test_1.2T_v2.tar.gz, resource_hierarchy=]
Aug 24 15:20:29 pid:91752 remote addresses: 127.0.0.1, 145.x.y.z ERROR: [-] /irods/lib/core/src/procApiRequest.cpp:162:int sendApiRequest(rcComm_t *, int, void *, bytesBuf_t *) :  status [SYS_HEADER_WRITE_LEN_ERR]  errno [Connection timed out] -- message [failed to call 'write body']
        [-]     /irods/lib/core/src/sockComm.cpp:1285:irods::error sendRodsMsg(irods::network_object_ptr, const char *, bytesBuf_t *, bytesBuf_t *, bytesBuf_t *, int, irodsProt_t) :  status [SYS_HEADER_WRITE_LEN_ERR]  errno [Connection timed out] -- message [failed to call 'write body']
                [-]     /irods/plugins/network/tcp/libtcp.cpp:355:irods::error tcp_send_rods_msg(irods::plugin_context &, const char *, bytesBuf_t *, bytesBuf_t *, bytesBuf_t *, int, irodsProt_t) :  status [SYS_HEADER_WRITE_LEN_ERR]  errno [Connection timed out] -- message [writeMsgHeader failed]
                        [-]     /irods/lib/core/src/sockComm.cpp:503:irods::error writeMsgHeader(irods::network_object_ptr, msgHeader_t *) :  status [SYS_HEADER_WRITE_LEN_ERR]  errno [Connection timed out] -- message [wrote 0 expected -1946157056]
                                [-]     /irods/plugins/network/tcp/libtcp.cpp:276:irods::error tcp_write_msg_header(irods::plugin_context &, bytesBuf_t *) :  status [SYS_HEADER_WRITE_LEN_ERR]  errno [Connection timed out] -- message [wrote 0 expected -1946157056]

Aug 24 15:20:29 pid:91752 remote addresses: 127.0.0.1, 145.100.2.197 ERROR: [-] /irods/lib/core/src/procApiRequest.cpp:250:int readAndProcApiReply(rcComm_t *, int, void **, bytesBuf_t *) :  status [SYS_HEADER_READ_LEN_ERR]  errno [] -- message [failed to call 'read header']
        [-]     /irods/lib/core/src/sockComm.cpp:197:irods::error readMsgHeader(irods::network_object_ptr, msgHeader_t *, struct timeval *) :  status [SYS_HEADER_READ_LEN_ERR]  errno [] -- message [failed to call 'read header']
                [-]     /irods/plugins/network/tcp/libtcp.cpp:194:irods::error tcp_read_msg_header(irods::plugin_context &, void *, struct timeval *) :  status [SYS_HEADER_READ_LEN_ERR]  errno [] -- message [only read [0] of [4]]

Aug 24 15:20:29 pid:91752 remote addresses: 127.0.0.1, 145.x.y.z ERROR: [operator():232] - failed closing [/bsc/home/rods#vzSARA2/test_1.2T_v2.tar.gz] with status [-4000]
Aug 24 15:20:29 pid:91752 remote addresses: 127.0.0.1, 145.x.y.z ERROR: [-] /irods/lib/core/src/rcConnect.cpp:272:int rcDisconnect(rcComm_t *) :  status [SYS_HEADER_WRITE_LEN_ERR]  errno [Broken pipe] -- message [failed to call 'write body']
        [-]     /irods/lib/core/src/sockComm.cpp:1285:irods::error sendRodsMsg(irods::network_object_ptr, const char *, bytesBuf_t *, bytesBuf_t *, bytesBuf_t *, int, irodsProt_t) :  status [SYS_HEADER_WRITE_LEN_ERR]  errno [Broken pipe] -- message [failed to call 'write body']
                [-]     /irods/plugins/network/tcp/libtcp.cpp:355:irods::error tcp_send_rods_msg(irods::plugin_context &, const char *, bytesBuf_t *, bytesBuf_t *, bytesBuf_t *, int, irodsProt_t) :  status [SYS_HEADER_WRITE_LEN_ERR]  errno [Broken pipe] -- message [writeMsgHeader failed]
                        [-]     /irods/lib/core/src/sockComm.cpp:503:irods::error writeMsgHeader(irods::network_object_ptr, msgHeader_t *) :  status [SYS_HEADER_WRITE_LEN_ERR]  errno [Broken pipe] -- message [wrote 0 expected -1946157056]
                                [-]     /irods/plugins/network/tcp/libtcp.cpp:276:irods::error tcp_write_msg_header(irods::plugin_context &, bytesBuf_t *) :  status [SYS_HEADER_WRITE_LEN_ERR]  errno [Broken pipe] -- message [wrote 0 expected -1946157056]


But the file is complete at the other site/zone. The file size is the same on both sides at the OS level.

I will set the parameter and try again.

Thanks.

Robert Verkerk




Robert Verkerk

unread,
Aug 25, 2022, 10:19:17 AM8/25/22
to irod-chat
Hi,

I set the parameter to 600 and tested again. I also did a tcpdump on the port 1247 for the remote zone. My findings are as follows:

date/time
* 15:26:00 start irsync command
* 15:36:02 first check for keepalive on port 1247
* 15:37:17 2nd check
* 15:38:33 3th check
* 15:39:48 4th check
* 15:41:03 5th check
* 15:42:18 6th check
* 15:43:34 7th check
* 15:44:49 8th check
* 15:46:04 9th check
* 15:47:20 disconnect it seems

And after 15:47 the process to port 1247 was gone.

And these are the settings on my side:

tcp_keepalive_time = 600
tcp_keepalive_intvl = 75
tcp_keepalive_probe = 9

The last two entries from tcpdump:
15:46:04.779571 IP 145.x.y.x.36006 > 84.x.y.z.1247: Flags [.], ack 1, win 485, options [nop,nop,TS val 3610210304 ecr 1831232682], length 0
15:46:04.779577 IP 145.x.y.x.36006 > 84.x.y.z.1247: Flags [.], ack 1, win 485, options [nop,nop,TS val 3610210304 ecr 1831232682], length 0
15:47:20.043514 IP 145.x.y.x.36006 > 84.x.y.z.1247: Flags [R.], seq 1, ack 1, win 485, options [nop,nop,TS val 3610285568 ecr 1831232682], length 0
15:47:20.043519 IP 145.x.y.x.36006 > 84.x.y.z.1247: Flags [R.], seq 1, ack 1, win 485, options [nop,nop,TS val 3610285568 ecr 1831232682], length 0

What am I missing? Or doing wrong?

Greetings,

Robert Verkerk

Terrell Russell

unread,
Aug 25, 2022, 11:03:32 PM8/25/22
to irod...@googlegroups.com
We're closing it on it...

Some networking equipment in the middle has a faster timeout than 600.   Change tcp_keepalive_time to 300?  100?

The fact that the entire file made it to the other side is good news - the only thing you're now debugging is why the 1247 connection is getting dropped before it can do the 'finalize' step to get the iRODS bookkeeping correct in the catalog.

Terrell



Robert Verkerk

unread,
Aug 26, 2022, 8:36:50 AM8/26/22
to irod-chat
Hi,

Thanks for the help. We managed to get the file over to BSC.

$ date ; irsync -V i:test_1.2T.tar.gz i:/bsc/home/rods#vzSARA2/test_1.2T_v2.tar.gz -R compbiomed ; date
Fri Aug 26 09:49:58 CEST 2022
   test_1.2T.tar.gz           1202137.375 MB | 11535.284 sec | 4 thr | 104.214 MB/s
Fri Aug 26 13:02:15 CEST 2022

$ ils -l /bsc/home/rods#vzSARA2/test_1.2T_v2.tar.gz
  rods              0 compbiomed 1260532359168 2022-08-26.09:50 & test_1.2T_v2.tar.gz

This was with the tcp_keepalive_time set to 150.

Robert Verkerk


Terrell Russell

unread,
Aug 26, 2022, 8:41:31 AM8/26/22
to irod...@googlegroups.com
Excellent.     Thanks for the follow up - this will certainly help the next person.

Terrell



joris luijsterburg

unread,
May 8, 2023, 10:42:20 AM5/8/23
to iRODS-Chat
All,

I encountered a similar situation, but for me it seems to be the database connection. I was sending a 166GB file that failed at the end of the transfer. The file ended up in the resource, it had the correct size, but I got a different error:

status = -818000 status = -818000 CAT_NO_ACCESS_PERMISSION

From the irods log I saw the following entries:

log_messagecllExecSqlWithResultBV: SQLExecDirect error: -1, sql:select object_id from R_OBJT_ACCESS OA, R_DATA_MAIN DM, R_USER_GROUP UG, R_USER_MAIN UM, R_TOKN_MAIN TM where OA.object_id=? and UM.user_name=? and UM.zone_name=? and UM.user_type_name!='rodsgroup' and UM.user_id = UG.user_id and OA.object_id = DM.data_id and UG.group_user_id = OA.user_id and OA.access_type_id >= TM.token_id and TM.token_namespace ='access_type' and TM.token_name = ? limit 1
log_messageSQLSTATE: 08S01 log_messageSQLCODE: 35
log_messageSQL Error message: SSL SYSCALL error: EOF detected ; The connection has been lost log_message[db_check_permission_to_modify_data_object_op:15172] - [user does not have permission to modify object with data id [13406]] log_message[finalize:435] - [user not allowed to modify data object [data id=[13406]]]
So in here irods concludes that I can't modify the metadata because the query to check if I had permission failed.
Now why does the query fail? Our database runs in a different server, which has a loadbalancer in front. The loadbalancer has a 5 minute timeout, and my file transfer took 11 minutes. My provider server tcp_keepalive_time was set at 1800, so it didn't kick in yet. When I set the  tcp_keepalive_time at 100 and retried, the transfer succeeded. So the solution seems to be the same here as in this thread.

My assumption is now that irods will write some information to the database at the start of the file transfer, and when the transfer ends it will write some more data to the database, but it will keep the connection open while doing so instead of reinitiating it at the end of the file transfer. Which means that you indeed have to set the keepalive settings properly.

best regards,

Joris


Kory Draughn

unread,
May 9, 2023, 8:30:17 AM5/9/23
to irod...@googlegroups.com
Hi Joris,

Yes, there are several issues about TCP keepalive in the repository.

We'll make sure to add options for controlling keepalive in 4.3.1.

Thanks,

Kory Draughn
Chief Technologist
iRODS Consortium


Reply all
Reply to author
Forward
0 new messages