Hi Deven,
I wanted to share some thoughts on this topic since I’ve helped several groups onboard to Globus for both institutional and research project use. I also have an ulterior motive, because there are some parts of getting started that I think are better served by the community rather than the Globus team. I’ll lay them out in my reply below and encourage anyone who feels the same to talk to me at GlobusWorld next month (Karl, I’m hoping you’ll be there).
And in full disclosure, I used to work for Globus, so I'm at least as biased as Lev. I also recognize that other factors may affect whether or not Globus is the right choice.
the maintenance overhead of running a Globus Server instance.

As others have written, the maintenance and operations burden of a GCS instance is low, provided you dedicate the servers, containers, etc., to just being DTNs. If you start adding other applications or capabilities to those systems, you're asking for headaches. This is a chance to have at least one piece of functionality that is not tightly coupled to other components; take advantage of it. You can even pause activity on the endpoint to handle file system outages or maintenance.
However, setting up a GCS instance can take some effort and planning. This is where I think the community's shared experience could make a big difference. I can usually bring up a basic Globus collection (POSIX or S3) in less than an hour. That's because I understand how things like the storage gateways and identity mapping work. But I have also spent hours and opened tickets dealing with particular issues, because every system is different.
I would like to see the community share at least a couple of things to help reduce the up-front costs:
1. Sample configurations for GCS components, particularly identity mapping. The core Globus documentation can only cover some of these.
2. A single document with a table of all the GCS components and the parameters for each. Those parameters map to policy decisions; for example, knowing where you allow or limit access to parts of the file system is important.
Rather than expect the Globus documentation to cover all of the potential use cases (and test them before updates) we could contribute this as a community.
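To give a sense of what one of those sample configurations might look like, here is a sketch of an expression-based identity mapping document of the kind a GCS storage gateway accepts. The domain and the regular expression are placeholders for illustration; check the current Globus documentation for the exact schema before using anything like this.

```json
{
  "DATA_TYPE": "expression_identity_mapping#1.0.0",
  "mappings": [
    {
      "source": "{username}",
      "match": "(.*)@example\\.edu",
      "output": "{0}"
    }
  ]
}
```

A mapping like this would turn a federated username such as alice@example.edu into the local account alice. It's exactly the kind of fragment where a library of community-contributed examples, each annotated with the policy it implements, would save people hours.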
the UX from the perspective of customers who could use Globus to transfer data to us (and vice-versa).
This hasn't been a problem, provided the projects and users understand that Globus does not provide a mount for the data. I generally ask several questions up front to make sure they're not looking for Google Drive-style access to their data, and that they understand what can be provided. In other words, make sure you're offering the right solution.
How it integrates with S3
Very well. You’ll need to understand how to configure the identity mapping portion, of course. I’ve helped projects migrate data to and from S3 routinely.
How access control is managed, e.g. how it integrates with OIDC providers and how permissions are managed.
Overall, any issues here aren't with Globus; they're with the idiosyncrasies of the storage systems and the identity providers. For example, UCSD presents to the world a username that is a long opaque string, impossible (on purpose) to map to something useful like "rpwagner", my typical username. So we've developed trusted ways to map the UUIDs of our identities to the accounts on the storage systems. Experiences like these are why I think we could help each other with some shared knowledge.
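To make that UUID-to-account mapping concrete, here is a minimal sketch of the kind of logic an external identity-mapping callout can apply: a lookup table keyed on the Globus identity UUID rather than on the opaque username. The table contents, the UUID, and the exact input/output field names are assumptions for illustration; GCS defines the real callout protocol (JSON in, JSON out), so consult the documentation for the exact document schemas.

```python
import json

# Hypothetical table mapping Globus identity UUIDs to local accounts.
# In practice this might be backed by a database or a site directory.
UUID_TO_ACCOUNT = {
    "9c1371a8-0000-0000-0000-000000000000": "rpwagner",
}

def map_identities(mapping_input):
    """Map each identity in the callout input to a local account, if known.

    `mapping_input` is assumed to be a dict with an "identities" list,
    each entry carrying the identity's "id" UUID. Identities with no
    local account are simply omitted from the result.
    """
    results = []
    for identity in mapping_input.get("identities", []):
        account = UUID_TO_ACCOUNT.get(identity["id"])
        if account is not None:
            results.append({"id": identity["id"], "output": account})
    return {"result": results}

# Example callout input in the assumed shape: the federated username is
# opaque, but the identity UUID is stable and mappable.
example_input = {
    "identities": [
        {
            "id": "9c1371a8-0000-0000-0000-000000000000",
            "username": "a7f3c9e1b2d4@ucsd.edu",
        }
    ]
}
print(json.dumps(map_identities(example_input)))
```

The point of keying on the UUID is that it survives username changes and opaque identity-provider policies; the trust decision lives in how the table itself is populated and maintained.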
Outside of those difficulties, I have not found a single access control policy that could not be implemented at some level.
—Rick