Step 7 is a Python CLI job that checks whether any new data has appeared in the 'inbox'. If so, it grabs the ID from the inbox, then retrieves the manifest from the Synapse web server. If Globus reports that the transfer is complete, we use the standard Dataverse API to import the files as normal, then clean up. Since the large file is already "on" the Dataverse server (which itself is on the HPC cluster), the import into Dataverse isn't constrained by bandwidth or connection issues. But this is only a partial solution.
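The step-7 loop can be sketched as a small driver function. This is only an illustration: the Synapse, Globus, and Dataverse clients are site-specific, so they are passed in as callables, and the "SUCCEEDED" status string is an assumption about what the Globus status check returns.

```python
from typing import Callable

def process_inbox(
    inbox_ids: list[str],
    fetch_manifest: Callable[[str], dict],   # stand-in for the Synapse manifest fetch
    globus_status: Callable[[str], str],     # stand-in for the Globus transfer-status check
    import_files: Callable[[dict], None],    # stand-in for the standard Dataverse API import
    cleanup: Callable[[str], None],          # stand-in for inbox/staging cleanup
) -> list[str]:
    """Process every transfer ID found in the inbox; return the IDs imported."""
    imported = []
    for transfer_id in inbox_ids:
        manifest = fetch_manifest(transfer_id)
        # Only import once Globus says the transfer is complete; otherwise
        # leave the inbox entry in place for the next run of the job.
        if globus_status(transfer_id) != "SUCCEEDED":
            continue
        import_files(manifest)
        cleanup(transfer_id)
        imported.append(transfer_id)
    return imported
```

Keeping the external services behind callables also makes the job easy to dry-run with stubs before pointing it at the real cluster.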
We are planning to check whether a file is large and, if so, import a 'dummy' file through the API with the correct metadata, etc., but not the actual file itself (or perhaps a truncated version of the file). Once the small dummy is imported, we replace the Dataverse file on disk with the large file, then go into PostgreSQL and update the file size in the corresponding record.
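The dummy-file workaround might look something like the sketch below. The size threshold, paths, and the PostgreSQL table/column names (`datafile.filesize`) are assumptions about a typical Dataverse deployment, not tested details.

```python
import shutil
from pathlib import Path

LARGE_THRESHOLD = 2 * 1024**3  # assumed cutoff (2 GiB); tune per site

def make_placeholder(large: Path, placeholder: Path, head_bytes: int = 1024) -> None:
    """Write a truncated stand-in: the first head_bytes of the real file.

    This small file is what gets imported through the normal Dataverse API,
    carrying the correct metadata.
    """
    with large.open("rb") as src, placeholder.open("wb") as dst:
        dst.write(src.read(head_bytes))

def swap_in_large_file(large: Path, stored: Path) -> int:
    """Replace the imported placeholder on disk with the real large file.

    Returns the real size, which then has to be written back into the
    database record, since Dataverse recorded the placeholder's size.
    """
    shutil.move(str(large), str(stored))
    return stored.stat().st_size

# Follow-up SQL to correct the recorded size (schema names assumed):
FILESIZE_SQL = "UPDATE datafile SET filesize = %s WHERE id = %s;"
```

The risky part is the manual PostgreSQL update, so it is worth double-checking the schema of your Dataverse version before wiring this in.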
--
You received this message because you are subscribed to the Google Groups "Dataverse Big Data" group.
To unsubscribe from this group and stop receiving emails from it, send an email to dataverse-big-d...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/dataverse-big-data/fa931767-5fde-413c-957d-3f9d7405cd06n%40googlegroups.com.