Versioned Bagit Specification

dan...@verisart.com

May 18, 2015, 11:55:10 AM
to digital-...@googlegroups.com
Hello, I'm wondering if any of you have ever seen or developed a specification on top of the BagIt specification that includes versioning. I'm currently faced with managing a folder of files where files can be added but never removed. Each time the bag changes (by having a file added) we need to create a new manifest, but also retain all previous manifests so we can track the provenance of the bag over time. I was thinking of just stashing the previous manifests within the /data folder with a timestamp appended to each name, and maybe keeping a running log of versions in one of the tag files, but I'm not sure if something has already been developed.

Thanks,
Dan Riley
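
[Editor's sketch] The manifest-retention idea described above could look roughly like the following in Python. This is a minimal sketch under stated assumptions, not an existing specification: the `versions/` directory and `version-log.txt` file are hypothetical names, and the sketch keeps old manifests outside `data/` (as tag files) rather than inside it, so that they do not themselves become payload that the next manifest has to list. It only handles `manifest-sha256.txt`; a complete bag would also need `bagit.txt`, and any tag manifest would have to be regenerated separately.

```python
import hashlib
import shutil
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def snapshot_manifest(bag_dir: Path, stamp: Optional[str] = None) -> Path:
    """Archive the current manifest (if any), then write a fresh one.

    `stamp` is the time of archiving, not the time the old manifest
    was created. Returns the path of the newly written manifest.
    """
    bag_dir = Path(bag_dir)
    manifest = bag_dir / "manifest-sha256.txt"
    versions = bag_dir / "versions"        # hypothetical tag directory
    log = bag_dir / "version-log.txt"      # hypothetical running log
    stamp = stamp or datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")

    # Keep the outgoing manifest under a timestamped name and log it.
    if manifest.exists():
        versions.mkdir(exist_ok=True)
        archived = versions / f"manifest-sha256.{stamp}.txt"
        shutil.copy2(manifest, archived)
        rel = archived.relative_to(bag_dir).as_posix()
        with log.open("a", encoding="utf-8") as fh:
            fh.write(f"{stamp} {sha256_of(archived)} {rel}\n")

    # Rebuild the manifest from every file under the payload directory.
    payload = sorted(p for p in (bag_dir / "data").rglob("*") if p.is_file())
    lines = [
        f"{sha256_of(p)}  {p.relative_to(bag_dir).as_posix()}\n"
        for p in payload
    ]
    manifest.write_text("".join(lines), encoding="utf-8")
    return manifest
```

Calling `snapshot_manifest(Path("/path/to/bag"))` after each file is added would leave the newest manifest in the usual place, so existing BagIt validators still see an ordinary bag, while each log line records the checksum of an archived manifest so that later tampering with the history is detectable.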

Mark A. Matienzo

May 18, 2015, 12:48:47 PM
to digital-...@googlegroups.com
Hi Dan -

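This has definitely been a past conversation on the digital-curation Google Group before. Here are some links to past threads:

[0] https://groups.google.com/d/topic/digital-curation/ri-k6idOLRk/discussion
[1] https://groups.google.com/d/topic/digital-curation/rQVCd8sDGM4/discussion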

Hope that helps, at least as a starting point.

Cheers,
Mark

--
Mark A. Matienzo <ma...@matienzo.org>
Director of Technology, Digital Public Library of America

Ed Summers

May 18, 2015, 1:12:25 PM
to digital-...@googlegroups.com

> On May 18, 2015, at 12:48 PM, Mark A. Matienzo <mark.m...@gmail.com> wrote:
>
> Hi Dan -
>
> This has definitely been a past conversation on the digital-curation Google Group before. Here are some links to past threads:
>
> [0] https://groups.google.com/d/topic/digital-curation/ri-k6idOLRk/discussion
> [1] https://groups.google.com/d/topic/digital-curation/rQVCd8sDGM4/discussion

In the same vein as git-annex (mentioned at the end of the first discussion), you might be interested in checking out Git Large File Storage (git-lfs) [1], which allows for more efficient blob storage in a Git repo.

GitHub is rolling git-lfs support out as a service, but it’s an actual git extension which could theoretically be used with other storage systems by implementing its API [2,3].

So, for example, you could theoretically use BagIt to represent the package, but use git to version it. In theory :-)

//Ed

[1] https://git-lfs.github.com/
[2] https://github.com/github/git-lfs/blob/master/docs/spec.md
[3] https://github.com/github/git-lfs/blob/master/docs/api.md

Mark Jordan

May 18, 2015, 1:18:26 PM
to digital-...@googlegroups.com
Dan,

If you just want to version your manifest, take a look at GitBags (https://github.com/mjordan/GitBags), specifically the section on "light" GitBags.

Mark


dan...@verisart.com

May 18, 2015, 3:53:13 PM
to digital-...@googlegroups.com
Great, this kind of stuff was just what I was looking for.

What I like about BagIt is that there are several open source packages for verifying a bag's structure. If I used a different approach, it would be harder for a third party to verify. That's why I was gravitating towards a method that is still BagIt compliant but carries some extra versioning info. But yes, I'm checking out Moab and these git solutions. The other thing is that this will be stored on S3, so I'm not sure how git repos would work there (whether the entire repo needs to be downloaded and then re-uploaded on every change).

Anyway thanks! 