How to debug this Globus timeout problem?


Todd Pfaff

Jul 29, 2021, 12:29:12 AM
to Discuss
I'm getting the following timeout error when trying to do a globus ls of a collection.

Can anyone tell me how to go about diagnosing this problem?

$ globus ls 546dbc3a-4e1c-44b1-95e2-3a994355f263
Globus CLI Error: A Transfer API Error Occurred.
HTTP status:      502
request_id:       SSOhxX5sw
code:             ExternalError.DirListingFailed.Timeout
message:
                  Command Failed: Error (connect)
                  Endpoint: McMaster RHPCS globus01 collection (546dbc3a-4e1c-44b1-95e2-3a994355f263)
                  Server: m-70adcf.98410.8443.data.globus.org:443
                  Message: The operation timed out
                 

Dan Powers

Jul 30, 2021, 2:49:13 PM
to Discuss, todd....@gmail.com
Hi Todd,

The most common reasons you'd see a timeout on a directory listing operation against a GCSv5 collection would be:

1) The path against which the listing is being attempted contains so many objects that the timeout window is exceeded before the listing operation can complete. This can also happen if the storage system against which the directory listing operation is attempted is unresponsive.
2) The GCS Manager service on the endpoint is blocked by firewall policy.

In either case, you'll likely want to reach out to the endpoint admin to look into the issue further.

-Dan Powers
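
One way to start telling these two cases apart is to check whether a plain TCP connection to the server named in the error can be established at all. The following is a minimal sketch, not a Globus tool: `probe_tcp` is a hypothetical helper name, and the host is simply the one printed in the error output above. As a rough rule of thumb (an assumption, not a guarantee), a timeout suggests packets are being silently dropped somewhere on the path, while a refusal suggests the host was reached but nothing is listening on that port.

```python
import socket


def probe_tcp(host: str, port: int, timeout: float = 5.0) -> str:
    """Attempt one TCP connection to host:port and classify the outcome."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except socket.timeout:
        # No answer within the timeout window.
        return "timeout"
    except ConnectionRefusedError:
        # The host answered, but nothing accepted the connection.
        return "refused"
    except socket.gaierror:
        # The name could not be resolved.
        return "dns-failure"
    except OSError as exc:
        return f"error: {exc}"


# Example usage, run from a machine outside the campus network:
#   host = "m-70adcf.98410.8443.data.globus.org"
#   for port in (80, 443):
#       print(f"{host}:{port} -> {probe_tcp(host, port)}")
```

This only exercises TCP connection setup. It says nothing about whether the service behind the port is healthy, and it cannot be used to check UDP ports.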

Todd Pfaff

Jul 30, 2021, 3:56:07 PM
to Discuss, daniel...@globus.org, Todd Pfaff
Hi Dan,

Thanks for the reply.  I'm the endpoint admin but this is my first experience deploying a Globus Connect Server.  More inline below ...

In this particular case the target path has only one file so it shouldn't be timing out due to any storage performance related issues.

A firewall block is more likely, and that's what I've been exploring, but I haven't found any evidence of it yet.

Is it true that the only ports that need to be open are: 80(TCP), 443(TCP), and 50000-51000(UDP)?  Those are all open but see note 1 below.

It's worth noting that I've made it through all of the various stages of:
- installing GCS,
- creating an endpoint,
- starting GCS,
- successfully doing GCS login,
- creating GCS OIDC server,
- creating GCS posix storage gateway,
- creating GCS collection in that storage gateway,
- connecting to that GCS collection using either the globus.org web portal or the globus CLI on a Linux workstation,

yet I am then unable to do an ls or any other access of the collection without experiencing that timeout.


Note 1

I am admittedly trying to do something that may be a bit unusual and that I haven't found documented anywhere so far.  Externally, my Globus endpoint DNS name is aliased to an nginx front-end reverse-proxy / load-balancer server, and the actual Globus endpoint server runs behind this front-end.  All of the ports (80, 443, and 50000-51000) are proxy-passed from nginx to the Globus node.  With this setup, everything I've described above works fine; I only get stuck with the timeout when I try to ls the collection.

Does this give any clues to what may be causing the timeout?

I do know with certainty that the DNS name associated with the collection - that is, this name from my message below:

>                   Server: m-70adcf.98410.8443.data.globus.org:443

resolves to the front-end, which is passing all of the ports that I know should be passed.  At this point, I assume that I'm missing some piece of the puzzle in my attempt to get this working through this front-end approach.

Please note, I'm just beginning to test GCS.  I don't really care about performance yet, I'm just trying this approach to avoid having to ask others to open ports for the endpoint node in our campus firewall.  If this approach will simply not work for some reason, I'd really like to understand why, and then I'll move on.  I see no reason that this should not work though as long as the front-end can proxy-pass everything to the back-end and respond appropriately.  It would be helpful to know what exactly is happening during the globus ls process that is timing out.

Thanks,
Todd

Dan Powers

Jul 30, 2021, 4:28:01 PM
to Discuss, todd....@gmail.com, Dan Powers
Hi Todd,

Given the deployment scenario you describe it would probably be best to move this into a ticket so we can look into things further. You can do this at sup...@globus.org.

-Dan Powers

Todd Pfaff

Jul 30, 2021, 7:23:17 PM
to Discuss, daniel...@globus.org, Todd Pfaff
Ok, will do, thanks Dan.  I'll report back here if and when I find a solution.

Bowen Deng

Nov 7, 2023, 3:59:09 PM
to Discuss, todd....@gmail.com
Hi Todd,

I'm running into an issue similar to yours. Did you ever find a solution?

Thanks,
Bowen

Joe Bester

Nov 7, 2023, 7:04:36 PM
to Bowen Deng, Discuss, todd....@gmail.com
If you are proxying port 443 you might run into problems, as there is some socket magic going on to pass file descriptors to the GridFTP server from the web front-end.

You can use the

globus-connect-server endpoint modify --gridftp-control-channel-port PORT

command to have the GridFTP server listen on a separate port for the control channel. GridFTP doesn't really support having another process act as the TLS server endpoint, so you'd need to pass the traffic through unmodified to that port.

Joe
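
Since the traffic has to reach the GridFTP server unmodified, one rough way to check whether a front-end is terminating TLS itself is to compare the certificate presented at the public name with the one presented when connecting to the back-end node directly. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the back-end address in the usage comment is a placeholder, and certificate verification is deliberately disabled because only the fingerprints are being compared.

```python
import hashlib
import socket
import ssl


def fingerprint(der_cert: bytes) -> str:
    """Return the SHA-256 fingerprint (hex) of a DER-encoded certificate."""
    return hashlib.sha256(der_cert).hexdigest()


def served_cert_fingerprint(connect_host: str, port: int, sni_name: str,
                            timeout: float = 10.0) -> str:
    """Connect to connect_host:port, send sni_name as the TLS server name,
    and fingerprint whatever certificate the peer presents."""
    ctx = ssl.create_default_context()
    # Inspection only: do not validate the certificate chain or hostname.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    with socket.create_connection((connect_host, port), timeout=timeout) as sock:
        with ctx.wrap_socket(sock, server_hostname=sni_name) as tls:
            der = tls.getpeercert(binary_form=True)
    return fingerprint(der)


# Example usage (10.0.0.5 is a placeholder for the back-end node's address):
#   name = "m-70adcf.98410.8443.data.globus.org"
#   via_proxy = served_cert_fingerprint(name, 443, name)
#   direct = served_cert_fingerprint("10.0.0.5", 443, name)
#   print("same certificate" if via_proxy == direct else "different certificate")
```

If the fingerprints differ, the front-end is presenting its own certificate and is therefore acting as the TLS endpoint. Matching fingerprints are consistent with pass-through but do not prove it, since a proxy could have been configured with the same certificate.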

Jeffrey Frey

Nov 7, 2023, 7:35:10 PM
to Joe Bester, Bowen Deng, Discuss, todd....@gmail.com
Was it a design choice to force port 443 for the GCS endpoint and not allow that to be configurable, too? My gut says that it reflected the general "pass" given to 443 traffic in ACLs and firewalls, but having to ensure the other port ranges are open end-to-end makes that somewhat moot.

/*!
@signature Jeffrey Frey, Ph.D
@email fr...@udel.edu
@source iPhone
*/

> On Nov 7, 2023, at 19:04, Joe Bester <bes...@globus.org> wrote:
>
> If you are proxying port 443 you might run into problems, as there is some socket magic going on to pass file descriptors to the GridFTP server from the web front-end.

Joe Bester

Nov 9, 2023, 7:20:51 PM
to Jeffrey Frey, Bowen Deng, Discuss, todd....@gmail.com
We needed 443 for a few other things in GCS and were hoping to limit the number of special cases in firewalls. Specifically, we need it for the GCS Manager API, Globus OIDC server (if used), and https access to collections.

Joe