Re: [OpenScienceFramework] OSF todo? enable/foster scriptable transfer to data repositories

Philip Durbin

Feb 27, 2014, 1:36:23 PM
Hi Tom,

Dataverse provides a scriptable "Data Deposit API" based on the
SWORDv2 protocol. Here are some examples with curl:
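
For anyone who would rather script this in Python than curl, here is a rough sketch of what a SWORDv2 metadata deposit might look like. It only builds the request, it does not send it. The collection URL, the credentials, and the helper names build_atom_entry and build_deposit_request are all made up for illustration; the real endpoint and the supported metadata fields depend on the repository, so treat this as a sketch under those assumptions rather than the actual Dataverse API.

```python
import base64
import urllib.request
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
DCTERMS_NS = "http://purl.org/dc/terms/"


def build_atom_entry(title, creator, description):
    """Return an Atom entry (bytes) carrying a few Dublin Core terms.

    Hypothetical helper: which terms a repository accepts is an assumption.
    """
    ET.register_namespace("atom", ATOM_NS)
    ET.register_namespace("dcterms", DCTERMS_NS)
    entry = ET.Element(f"{{{ATOM_NS}}}entry")
    for tag, value in (
        ("title", title),
        ("creator", creator),
        ("description", description),
    ):
        ET.SubElement(entry, f"{{{DCTERMS_NS}}}{tag}").text = value
    return ET.tostring(entry, encoding="utf-8")


def build_deposit_request(collection_url, entry_xml, username, password):
    """Build (but do not send) a POST of an Atom entry to a SWORD collection.

    Hypothetical helper: HTTP Basic auth is assumed here; a given
    repository may require something else.
    """
    credentials = base64.b64encode(
        f"{username}:{password}".encode("utf-8")
    ).decode("ascii")
    return urllib.request.Request(
        collection_url,
        data=entry_xml,
        method="POST",
        headers={
            # Media type SWORDv2 uses for metadata-only Atom entry deposits.
            "Content-Type": "application/atom+xml;type=entry",
            "Authorization": f"Basic {credentials}",
        },
    )


if __name__ == "__main__":
    entry = build_atom_entry(
        "Example dataset", "Doe, Jane", "Daily netCDF model output"
    )
    request = build_deposit_request(
        # Placeholder URL, not a real endpoint.
        "https://repository.example.org/swordv2/collection/demo",
        entry,
        "user",
        "secret",
    )
    print(request.get_method(), request.full_url)
    print(entry.decode("utf-8"))
```

Passing the request to urllib.request.urlopen would actually perform the deposit. Since it is plain HTTP(S), it should also work from clusters that only allow HTTP and SSL out.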

I wrote the API implementation so any bugs are my fault. :)

Our primary use case when developing the API was integration between
Open Journal Systems (OJS) and Dataverse:

COS generously hosted me and my boss back in September (hi, everyone!)
and we're working on an integration between OSF and Dataverse that
makes use of the API:

Actually, COS is even helping to develop a Python library to talk to
the Dataverse API (which I really, really appreciate, not being much
of a Pythonista):

But enough about Dataverse. Lots of other repositories support SWORD.
There's an official list at and my
(slightly longer) list at

But enough about SWORD. Are there other protocols for this? Let me
know! Because some of the stuff we want to do is not covered by the

Hope this helps,


p.s. Cc'ing the Dataverse community list on this.

On Thu, Feb 27, 2014 at 12:21 PM, Tom Roche <> wrote:
>> [Tom Roche Thu, 27 Feb 2014 10:48:23 -0500]
>> Morpho (at least, the previous version) was fairly time-consuming (manually inputting metadata, not to mention data transfer)
> Dunno if this is already in-plan, but one thing I'd like to see OSF tool up (working with providers to enable as necessary) is CLI/scriptable data transfer, esp metadata transfer, to repositories. When attempting to repositorize hundreds (daily for a year, plus spinup) of often-multi-GB netCDF files
> 1. interacting with a GUI or web UI is painful and slow.
> 2. `tar` seems unattractive, since (I suspect)
> * probability of transfer abend grows with {file size, transfer time}, for both uploaders (i.e., me) and downloaders (i.e., collaborators, replicators).
> * downloaders will likely want subsets of the data
> 3. .tar.gz does not help here, since netCDF are already fairly compact binaries.
> Implementation-wise, I'd favor HTTP APIs similar to those already used by BitBucket and GitHub, but only because the clusters on which I work only allow HTTP and SSL out.
> Again, this may require work with the repos to provide necessary plumbing on their side. Along those lines (dunno if this is too off-topic), if anyone has pointers to currently-transfer-scriptable repositories, please pass. I have a proposed question about this @ the proposed Open Science Stack Exchange
> FWIW, Tom Roche <>

Philip Durbin
Software Developer for