On Feb 21, 2021, at 12:05 PM, Ames, Sasha <am...@llnl.gov> wrote:
That would be a really cool student project to write a FUSE file system on top of the Globus API.
Cheers,
I was wondering if there is something, or plans for something, that would combine the file-system-like access of sshfs with the speed of Globus. In my use case I have several repositories of hundreds of TB each in different locations, which I would like to combine into one data analysis platform. Typically each analysis needs only hundreds of GB from several of these large repos, and the favourite datasets are often used repeatedly. Bringing all the data over is prohibitive. The solution would be a large local cache that automatically holds the favourites, combined with the speed of Globus to fetch whatever is missing. I guess this is essentially sshfs, but with Globus as the transfer mechanism instead of SSH. Does something like that exist?
Thanks - Falk
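To make the caching idea above concrete, here is a minimal sketch of a size-bounded, least-recently-used cache that fetches on a miss. Everything in it is an assumption for illustration: the class name FetchOnMissCache and the fetch callback are hypothetical, the callback only stands in for a real Globus transfer, and a real cache of this kind would hold files on local disk rather than bytes in memory.

```python
from collections import OrderedDict
from typing import Callable, List


class FetchOnMissCache:
    """Size-bounded LRU cache that calls a fetch function on a miss.

    Hypothetical illustration only: `fetch` stands in for whatever
    mechanism (for example a Globus transfer) brings the data over.
    """

    def __init__(self, capacity_bytes: int, fetch: Callable[[str], bytes]):
        self.capacity_bytes = capacity_bytes
        self.fetch = fetch
        # Ordered from least recently used to most recently used.
        self._entries: "OrderedDict[str, bytes]" = OrderedDict()
        self._used = 0

    def get(self, path: str) -> bytes:
        if path in self._entries:
            # Hit: mark as most recently used, no remote call needed.
            self._entries.move_to_end(path)
            return self._entries[path]
        # Miss: fetch from the remote side, then try to keep a copy.
        data = self.fetch(path)
        self._store(path, data)
        return data

    def _store(self, path: str, data: bytes) -> None:
        if len(data) > self.capacity_bytes:
            return  # larger than the whole cache: serve it uncached
        # Evict least recently used entries until the new one fits.
        while self._used + len(data) > self.capacity_bytes:
            _, evicted = self._entries.popitem(last=False)
            self._used -= len(evicted)
        self._entries[path] = data
        self._used += len(data)

    def cached_paths(self) -> List[str]:
        """Paths currently held, least recently used first."""
        return list(self._entries)
```

Repeated reads of a favourite path keep moving it to the "recent" end, so it survives eviction, while paths that are read once are the first to be dropped when space runs out.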
The student’s FUSE implementation is here: https://github.com/austinbyers/GlobusFS
This was built as part of a class project, and it worked OK in small tests. The code was developed about 5 years ago, so it would likely need some updates. It would probably serve as a good example for future student projects. The student implemented a cache to hide some of the network latency and reduce the number of network calls, but latency was still a problem.
The Whole Tale project (wholetale.org) implements a read-only Globus file system abstraction in the Girder framework (https://github.com/whole-tale/globus_handler). This implementation also makes use of a cache to store files in the Whole Tale environment and make them accessible to running “Tales” (e.g., notebooks hosted in Docker containers).
Kyle