Globus Connect Personal transfers (performance tips needed)

john kennedy

Sep 12, 2023, 6:27:19 AM
to dis...@globus.org
Hi all,

Could anyone point me towards any tuning tips regarding GCP->GCP transfers?

We have a user group that is transferring data to our institute, and they see significant differences in speed between transfers across two Globus Connect Personal (GCP) endpoints and transfers between a GCP endpoint and a Globus Connect Server (GCS) endpoint.

From the user perspective:

Local cluster (GCP) -> GCS: 40-50 MB/s

Local cluster (GCP) -> GCP (remote cluster): 8 MB/s

Our GCS at MPCDF acts as a staging service, meaning the users need to stage to and then from the server [1]. If possible they would prefer to use GCP-GCP transfers to directly move the data.

The file sizes are not small (256 MB to 3 GB), so I don't expect the "lots of small files" issue to be at play here. Transfers between the two clusters using bbcp achieve speeds similar to GCP-GCS, so the difference in network route between the two kinds of transfer doesn't seem to be the issue. The project needs to regularly transfer several TBs of data, so performance is important to them (and they'd sooner be using Globus than bbcp, t.b.h.).
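To put the two observed rates in perspective, here is a rough back-of-the-envelope sketch. The 5 TB figure is only an assumed stand-in for "several TBs", 45 MB/s is taken as the midpoint of the reported 40-50 range, and decimal units (1 TB = 1,000,000 MB) are assumed throughout.

```python
def transfer_hours(total_tb: float, rate_mb_per_s: float) -> float:
    """Hours needed to move total_tb terabytes at rate_mb_per_s megabytes per second.

    Uses decimal units: 1 TB = 1,000,000 MB.
    """
    total_mb = total_tb * 1_000_000
    return total_mb / rate_mb_per_s / 3600


# Assumed dataset size; the thread only says "several TBs".
DATASET_TB = 5

for label, rate in [("GCP -> GCS", 45.0), ("GCP -> GCP", 8.0)]:
    hours = transfer_hours(DATASET_TB, rate)
    print(f"{label}: {hours:.1f} h for {DATASET_TB} TB at {rate} MB/s")
```

Note that the staged route needs two hops (to the GCS and then from it), so the end-to-end comparison depends on how fast the second hop runs as well.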

Are we wrong in expecting GCP-GCP transfers to achieve similar performance to GCP-GCS? I know different protocols may be in use, but wonder if it's a matter of tuning.

Any hints would be much appreciated.

thanks in advance!
John
[1] - We have created a dedicated Flow to allow them to stage via the GCS server but it's very new and we're just seeing our first users test this.

Ken Carlile

Sep 12, 2023, 8:34:16 AM
to Discuss, john.ken...@gmail.com
In my experience, transfers with GCP as the destination are dramatically slower than those with GCS as the destination. I know there is a way to trick GCP-to-GCP transfers into going faster ("trick" is probably not the right word), but it depends on certain firewall settings and capabilities.

I'm just hopping on this ticket to put some more pressure on Globus to find a way to improve the speed on GCP receiving transfers, because that's a big pain point for my users as well. 

--Ken

Backeberg, David

Sep 12, 2023, 1:58:40 PM
to john kennedy, dis...@globus.org
Is there some kind of dire security restriction preventing either cluster from having a GCS endpoint? GCS-to-GCS would be best, especially if, as you say, this is a regular thing that needs to be done. That would be good motivation to acquire the gear and put in the effort to build a proper endpoint.

One thing that GCP commonly has to do, and GCS does not, is send its network traffic through a NAT.

If you are going GCP to GCP, you may need to traverse a NAT twice, once at each end.

So you can blame Globus Connect Personal, but it would be good to try another network protocol and see if you had the same performance there. What you actually might be seeing is how fast your campus hardware can on-the-fly rewrite every packet for a single stream.

Another question: can the remote cluster also reach the same GCS at 40-50 MB/s, or is the bottleneck actually on the remote cluster? I’ve seen dumb things, like bad wall wiring where a “10G” copper connection was actually negotiating at 100 Mbit/s. I moved the port over one slot and got the 10G.
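A quick way to rule out a mis-negotiated link like the one described above is to read the negotiated speed that Linux exposes under /sys/class/net/<interface>/speed. The following is a minimal sketch, assuming a Linux host; the function name `link_speeds` is made up for illustration, and virtual or down interfaces typically report no usable value.

```python
from pathlib import Path


def link_speeds(sysfs_net: str = "/sys/class/net") -> dict:
    """Return {interface: negotiated speed in Mbit/s, or None if unavailable}."""
    speeds = {}
    base = Path(sysfs_net)
    if not base.is_dir():
        return speeds  # not a Linux host, or sysfs is not mounted
    for iface in sorted(base.iterdir()):
        if not iface.is_dir():
            continue  # skip plain files such as bonding_masters
        try:
            value = int((iface / "speed").read_text().strip())
        except (OSError, ValueError):
            value = None  # loopback and some virtual devices cannot report a speed
        # The kernel reports -1 when the link is down or the speed is unknown.
        speeds[iface.name] = value if value is not None and value > 0 else None
    return speeds


if __name__ == "__main__":
    for name, mbit in link_speeds().items():
        print(f"{name}: {mbit if mbit is not None else 'unknown'} Mbit/s")
```

A 10G port that shows up as 100 here would point at cabling or the switch port rather than at Globus.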
--
David Backeberg <david.b...@yale.edu>
(203) 444-7089  Yale Center for Research Computing

Ken Carlile

Sep 12, 2023, 2:48:50 PM
to Discuss, david.b...@yale.edu, dis...@globus.org, john.ken...@gmail.com
I suspect it's not the NAT traversal that is the problem, but rather the fact that when GCP is receiving, it uses Mode S, which is essentially no chunking, no pipelining, no nothing--just 4 (I think) concurrent FTP sessions. That said, the workaround that I was referring to (now that I'm more conscious) is the UDT protocol. Unfortunately I'm not finding good documentation about how to ensure that it is running on GCP->GCP transfers. It is possible to enable it on GCS, but firewalls usually prevent its usage, resulting in STUN/ICE errors.

If both GCP endpoints fall under your subscription, however, you can allow Globus Plus users to change their Network Use settings to increase the concurrency, which may help to some degree. To do so, you need to have the GCP users set their endpoint visibility to Public, at which point, you as the subscription admin can go in and change their Managed status to Yes. This will enable the Edit Network Use button under the Server tab in the web app, which will allow you to set the concurrency (and also the parallelism, but unless the transfer is using UDT or Mode E, that doesn't come into effect). 

(hopes Globus folks don't think I'm overstepping my bounds...)

--Ken

Vas Vasiliadis

Sep 12, 2023, 3:06:27 PM
to Ken Carlile, Discuss, david.b...@yale.edu, john.ken...@gmail.com
This is all very informative dialogue - definitely not overstepping any bounds, Ken. The Globus Connect engineers may jump in here with more (they’re generally very busy people :-) but I would like to reinforce what Dave said below about GCS vs. GCP: if you do have any type of semi-persistent environment/recurring use case, you really should try to invest in a GCS endpoint.

The guiding principle behind GCP is that it should not require the user to configure anything beyond the initial installation. All of the suggestions about network parameters and such are valid, but they require “messin’ with stuff” that otherwise “just works”, and that’s not desirable for the majority of researchers.

That said, I recognize that some organizations have absolute policies that prevent external inbound connections and GCP becomes the only option.

Cheers,
Vas

Michael Link

Sep 12, 2023, 7:38:59 PM
to Discuss, john.ken...@gmail.com, david.b...@yale.edu, Ken Carlile
Hi All,

Ken's point about performance on GCS->GCP transfers is a good one, but
John's transfer in this case must be using UDT, as that is the only
method possible for GCP->GCP transfers.

Dave's suggestion of the possible NAT traversal hit is definitely a
possibility. Another possibility with that same idea in mind is that
the connection is being negotiated on a slower route than expected. The
UDT connection is formed using ICE negotiation, which, in very simple
terms, tries to connect between every combination of interfaces on both
ends. The first one to complete the connection wins, even if that may
not be the primary interface. John mentions this is on a cluster, so
multiple interfaces that could establish a connection wouldn't be
surprising.


If sticking with GCP, there isn't much you can do as far as tuning the
UDT connection. You'll want to investigate the above speculation during
a transfer, to see if the expected interface is being used. One option
is, with a subscription, increasing the concurrency via the network-use*
settings of both GCP endpoints. This should help if the resources
aren't already maxed out. Feel free to follow up with
sup...@globus.org if you need assistance with this.
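One way to carry out the interface check suggested above is to sample the per-interface byte counters while a transfer is running and see which interface actually carries the traffic. This is a minimal sketch, assuming a Linux host with /proc/net/dev; the function names and the 2-second sampling interval are illustrative choices, not anything Globus provides.

```python
import time


def parse_proc_net_dev(text: str) -> dict:
    """Map interface name -> (rx_bytes, tx_bytes) from /proc/net/dev contents."""
    counters = {}
    for line in text.splitlines()[2:]:  # the first two lines are column headers
        if ":" not in line:
            continue
        name, data = line.split(":", 1)
        fields = data.split()
        if len(fields) < 9:
            continue
        # Field 0 is received bytes; field 8 is transmitted bytes.
        counters[name.strip()] = (int(fields[0]), int(fields[8]))
    return counters


def busiest_interfaces(before: dict, after: dict, seconds: float) -> list:
    """Return (interface, rx MB/s, tx MB/s) tuples, busiest interface first."""
    rates = []
    for name, (rx1, tx1) in after.items():
        rx0, tx0 = before.get(name, (rx1, tx1))
        rates.append((name, (rx1 - rx0) / seconds / 1e6, (tx1 - tx0) / seconds / 1e6))
    return sorted(rates, key=lambda r: r[1] + r[2], reverse=True)


if __name__ == "__main__":
    interval = 2.0  # seconds between the two samples (arbitrary choice)
    try:
        with open("/proc/net/dev") as f:
            first = parse_proc_net_dev(f.read())
        time.sleep(interval)
        with open("/proc/net/dev") as f:
            second = parse_proc_net_dev(f.read())
    except OSError:
        print("/proc/net/dev not available; this sketch is Linux-only")
    else:
        for name, rx, tx in busiest_interfaces(first, second, interval):
            print(f"{name}: rx {rx:.1f} MB/s, tx {tx:.1f} MB/s")
```

If the interface at the top of the list during a transfer is not the expected high-speed one, that would support the idea that ICE settled on a slower path.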


I'll also echo what others have said about installing GCS if practical.
That will provide the most flexibility and tunability, and also raises
the possibility of using multiple nodes of the clusters.


*https://www.globus.org/subscriber-welcome-kit/best-practices-checklist#run-test-transfers-to-ensure-adequate-performance

Mike