Hi Zachary,
If the Globus folks aren’t able to find another solution, I can imagine one method, but it’s not perfect, and it would take a fair amount of work.
You could create a Globus service account (a Globus Auth Confidential OAuth client) and grant it read access to the entire data lake. Once users are authorized to access the data, instruct them to create a Guest Collection of their own and grant write access to your service account. Your service account would then initiate the transfer from your data lake into each user's Guest Collection.
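For concreteness, here is a rough sketch of the two documents involved in that flow, mirroring the Globus Transfer API's JSON formats. This is only illustrative: the function names, UUIDs, and paths are placeholders I made up, and in practice you'd submit these with the Globus Python SDK (or the REST API) using the confidential client's credentials.

```python
# Sketch (illustrative only) of the two Globus Transfer API documents in this
# flow. All UUIDs and paths are placeholders, not real values.

# 1) The access rule a *user* would create on their Guest Collection to grant
#    your service account write access. A confidential client's identity
#    username has the form <client_id>@clients.auth.globus.org.
def build_access_rule(service_account_identity_id):
    return {
        "DATA_TYPE": "access",
        "principal_type": "identity",
        "principal": service_account_identity_id,  # placeholder identity UUID
        "path": "/",
        "permissions": "rw",
    }

# 2) The transfer task your *service account* would submit to copy the
#    approved files from the data lake into the user's Guest Collection.
def build_transfer_task(source_collection, dest_collection, items, label=""):
    return {
        "DATA_TYPE": "transfer",
        "source_endpoint": source_collection,      # data lake collection UUID
        "destination_endpoint": dest_collection,   # user's Guest Collection UUID
        "label": label,
        "DATA": [
            {"DATA_TYPE": "transfer_item",
             "source_path": src,
             "destination_path": dst}
            for src, dst in items
        ],
    }

task = build_transfer_task(
    "DATA-LAKE-COLLECTION-UUID",
    "USER-GUEST-COLLECTION-UUID",
    [("/lake/approved/user123/data.tar", "/data.tar")],
    label="Approved data delivery",
)
```

Because the service account is always the one submitting the transfer task, each user only ever receives exactly the paths listed in that task's `DATA` array.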
The upside is that the users do not need to have access to the data lake. All the transfers would be initiated by your service account, so you can be certain that users won’t see the other files in the data lake.
The downside is that it limits where your users can transfer data to. Here’s what I mean:
• Guest Collections are a feature that requires a Globus subscription, so free users won't be able to use Globus to download data.
• Folks using Globus Connect Personal will need to have premium features enabled in order to create a guest collection.
• For folks using Globus Connect Server, the collection admin will need to allow guest collection creation. At Stanford, for example, our largest compute environment does not allow folks to create guest collections, though if they rent storage through our storage service, they can create guest collections there.
I recently learned that the ABCD Study also uses Globus as its exclusive method for bulk data transfer to users. An instructional video is available, though I'm not sure whether it uses the method I described above or a different one.
Please let us know what you decide to do! I expect it will be interesting.