Mac DS_store files and Bagit


Julie Swierczek

May 8, 2015, 11:01:29 AM
to digital-...@googlegroups.com
(I am only an occasional Mac user, so be gentle.) I've run into a problem with using bagit to bundle files for transfer from a Mac. I ran bagit on a folder, and then ran the validation command, and the bag was invalid because of the .DS_Store file. I don't want to preserve the .DS_Store file for posterity anyway. I've done some searching and found instructions for deleting all .DS_Store files - and, of course, hiding them, which is not at all helpful - but I haven't found anything about removing them from one folder and preventing new ones from being created in that folder.

Of course, the easy solution is to move them to another OS - but that is exactly the problem at hand.  I want to bag them before moving them anywhere.  Ideally, I'd like to bag them on the Mac and send them directly to the repository, but the only workaround I can think of is running a checksum on each individual file, moving them to another OS, validating the checksum on each file, bagging them, and then sending them to the repository.  This is less than ideal.

Any suggestions? I can't be the first person who has run into this problem.  Thanks for your help.

Julie

Michael Kjörling

May 8, 2015, 1:52:04 PM
to digital-...@googlegroups.com
On 8 May 2015 06:01 -0700, from juliecs...@gmail.com (Julie Swierczek):
> (I am only an occasional Mac user, so be gentle.) I’ve run into a problem
> with using bagit to bundle files for transfer from a Mac. I ran bagit on a
> folder, and then ran the validation command, and it was invalid because of
> the DS_Store file. I don’t want to preserve the DS_Store file for posterity
> anyway. I've done some searching and found instructions for deleting all
> DS_Store files - and, of course, hiding them, which is not at all helpful -
> but I haven't found anything about removing them from one folder and
> preventing new ones from being created in that folder.

I don't run a Mac at all myself so haven't had any reason to
investigate this, but _Asepsis_ seems to be almost exactly what you
are looking for. From http://asepsis.binaryage.com/:

> Asepsis prevents creation of .DS_Store files. It redirects their
> creation into a special folder.

So while it technically doesn't exactly _prevent_ creating .DS_Store
files _for_ a specific folder, it does _move them out of the specific
folder they are attached to_ which means they won't get included when
you run bagit on that folder (since the file isn't there, but rather
elsewhere; in /usr/local/.dscage, apparently).

Make sure to check OS version compatibility first; the FAQ on that
site has a list.

--
Michael Kjörling • https://michael.kjorling.semic...@kjorling.se
OpenPGP B501AC6429EF4514 https://michael.kjorling.se/public-keys/pgp
“People who think they know everything really annoy
those of us who know we don’t.” (Bjarne Stroustrup)

L Snider

May 8, 2015, 2:30:17 PM
to digital-...@googlegroups.com
Hi Julie,

I posted this exact same question a while back. Check the archives of the group, so you can see what was posted before. It might help?
https://groups.google.com/forum/#!topic/digital-curation/BNp2OWwbTAQ

Just one comment: you need to be very careful if you remove the .DS_Store file. It can cause major issues down the road for a Mac, or it might not; it just depends.

In the end, my issue with the .DS_Store file went away. It was weird. I did more testing and it didn't cause an issue. No clue why, but now I can do it without issue.

Cheers

Lisa

Lisa Snider
Archivist




--
You received this message because you are subscribed to the Google Groups "Digital Curation" group.
To unsubscribe from this group and stop receiving emails from it, send an email to digital-curati...@googlegroups.com.
To post to this group, send email to digital-...@googlegroups.com.
Visit this group at http://groups.google.com/group/digital-curation.
For more options, visit https://groups.google.com/d/optout.

Andrew Berger

May 8, 2015, 2:39:44 PM
to digital-...@googlegroups.com
I've run into this with both .DS_Store and Thumbs.db (on Windows) when trying to validate bags made years ago. As I understand it, the files are created when you view the folder using a GUI file manager like Finder. If this is just a one-time step where you're going to create the bag and then immediately send it elsewhere, and if you're comfortable with using only the Terminal, you could use the Terminal to:

1. Delete the .DS_Store file(s)
2. Create the bag
3. Validate the bag
4. Send the bag to its destination

However, the moment you or another person opens that folder in Finder, a new .DS_Store file will be created.
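
For anyone who would rather script steps 1 to 3 than type them by hand, here is a minimal sketch using bagit-python. The function names `remove_junk_files` and `clean_and_bag` are made up for this example, and it assumes the `bagit` module is installed and that you really do want the files deleted from the folder in place. It does not stop Finder (or anything else) from recreating .DS_Store afterwards.

```python
import os

# File names mentioned in this thread; adjust to taste.
JUNK_NAMES = {".DS_Store", "Thumbs.db"}


def remove_junk_files(root, names=JUNK_NAMES):
    """Delete files whose name is in `names` anywhere under `root`.

    Returns the list of paths that were removed.
    """
    removed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            if filename in names:
                path = os.path.join(dirpath, filename)
                os.remove(path)
                removed.append(path)
    return removed


def clean_and_bag(folder):
    """Steps 1-3: delete junk files, create the bag in place, validate it."""
    import bagit  # third-party module (bagit-python), imported only when needed

    remove_junk_files(folder)     # step 1
    bag = bagit.make_bag(folder)  # step 2: payload is moved into data/
    bag.validate()                # step 3: raises bagit.BagValidationError on failure
    return bag
```

Calling `clean_and_bag("/path/to/folder")` from a Terminal Python session, without opening the folder in Finder in between, would correspond to the one-time workflow described above; step 4 (sending the bag) is left to whatever transfer method you use.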

I should add the caveat that I never actually moved the bags I was validating. I just did a one-time validation after deleting the .DS_Store and Thumbs.db files and then started monitoring that location with AV Preserve's Fixity tool.

Hope this is helpful,
Andrew


Jarrett Drake

May 8, 2015, 4:17:02 PM
to digital-...@googlegroups.com
You can always use rsync with the exclude option to leave those files behind. That syntax, from the Terminal window, would be:

rsync -ah --progress --stats --exclude=".DS_Store*" [absolute path to source] [absolute path to destination]

After this process finishes, be sure not to open that destination folder with Finder. You can then bag the .DS_Store-free directory and be all set. To make sure you didn't pick up any of them, you could open the manifest and search for DS_Store. I think this should work for you, though.
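
The manifest check can also be scripted rather than done by eye. The following is only a sketch: `manifest_entries_matching` is a name invented here, and it assumes the usual bag layout in which each payload manifest is a `manifest-<algorithm>.txt` file at the top of the bag with one "checksum path" pair per line.

```python
import glob
import os


def manifest_entries_matching(bag_dir, needle=".DS_Store"):
    """Return (manifest_name, payload_path) pairs whose path contains `needle`."""
    hits = []
    # Matches manifest-md5.txt, manifest-sha256.txt, ... but not tagmanifest-*.txt.
    for manifest in sorted(glob.glob(os.path.join(bag_dir, "manifest-*.txt"))):
        with open(manifest, encoding="utf-8") as handle:
            for line in handle:
                parts = line.strip().split(None, 1)  # "<checksum> <path>"
                if len(parts) == 2 and needle in parts[1]:
                    hits.append((os.path.basename(manifest), parts[1]))
    return hits
```

An empty list from `manifest_entries_matching("/path/to/bag")` would mean no payload manifest mentions a .DS_Store file.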

Good luck,
Jarrett

dan...@verisart.com

May 18, 2015, 3:53:13 PM
to digital-...@googlegroups.com
It sounds like Bagit might need something like git's .gitignore file.

Julie Swierczek

May 19, 2015, 10:55:47 AM
to digital-...@googlegroups.com
Thanks for your responses. I didn't open the bag between creating it and validating it, so something must be occurring during the creation process that creates the .DS_Store file. I use bagit-python, and it may be that it uses a function to 'open' the directory at some point during creation. So the solution of avoiding opening the directory in Finder doesn't help in my case, since I wasn't doing that anyway.

I'll try using another method for bagging files and, barring that, I'll start exploring some of the other options suggested here and in Lisa's previous thread.

Thanks again.

Julie

daniel shimshoni

Feb 8, 2016, 11:00:28 AM
to Digital Curation
Absolutely. I don't know why it is not in the specification - that, at the very least, hidden files should NOT be included. It is very, very basic. 

Brian Vargas

Feb 9, 2016, 7:45:09 AM
to digital-...@googlegroups.com
The specification intentionally has nothing to say about hidden files. Whether or not to include them is a choice of the tool creating the bag, in the same way that tar chooses to include hidden files when creating a tarball, and git chooses to use .gitignore to selectively exclude files, hidden or otherwise.

Brian


Ben Fino-Radin

Feb 9, 2016, 11:12:19 AM
to Digital Curation
If your bag contains .DS_Store files and they are in the manifest, they were present at the time of bagging. If they are not in the manifest and your bag does not validate, they were created after the fact by viewing the bag in a GUI.
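
To tell the two cases apart without reading manifests by hand, a rough sketch like the one below lists payload files that exist on disk but appear in no manifest. The name `unlisted_payload_files` is invented for this example, and it assumes plain (not percent-encoded) paths in `manifest-*.txt` files and a payload stored under `data/`.

```python
import glob
import os


def unlisted_payload_files(bag_dir):
    """Return payload paths that exist under data/ but appear in no manifest."""
    # Collect every path named in any payload manifest.
    listed = set()
    for manifest in glob.glob(os.path.join(bag_dir, "manifest-*.txt")):
        with open(manifest, encoding="utf-8") as handle:
            for line in handle:
                parts = line.strip().split(None, 1)  # "<checksum> <path>"
                if len(parts) == 2:
                    listed.add(parts[1])

    # Collect every file actually present in the payload directory,
    # as a bag-relative path with forward slashes (as manifests use).
    on_disk = set()
    data_dir = os.path.join(bag_dir, "data")
    for dirpath, _dirnames, filenames in os.walk(data_dir):
        for filename in filenames:
            full = os.path.join(dirpath, filename)
            on_disk.add(os.path.relpath(full, bag_dir).replace(os.sep, "/"))

    return sorted(on_disk - listed)
```

A .DS_Store path in the returned list was added after bagging; one that shows up in the manifest instead was there when the bag was made.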

While .DS_Store files are certainly the bane of many of our workflows, I would like to defend BagIt's handling of hidden files. The idea of baking the exclusion of hidden files into BagIt strikes me as antithetical to preservation. There are numerous reasons why one would want to retain hidden files in materials you are acquiring.

If you are using the BagIt Python module, it would be trivial for a novice Python programmer to write a custom bagging script that removes .DS_Store files, and any other files you wish to exclude from your bag. In fact, here you go.
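
For readers without access to that script, here is a rough sketch of what such a custom bagging script could look like; it is not the script referred to above. It copies the source to a staging directory while skipping excluded names, then bags the copy, so the originals (hidden files included) are left untouched. The names `stage_without_excluded`, `bag_staged_copy`, and `EXCLUDE_PATTERNS` are invented for illustration, and the `bagit` module is assumed to be installed.

```python
import shutil

# Glob-style names to leave out of the bag; edit for your own use case.
EXCLUDE_PATTERNS = (".DS_Store", "Thumbs.db")


def stage_without_excluded(source, staging, patterns=EXCLUDE_PATTERNS):
    """Copy `source` to a new `staging` directory, skipping excluded names.

    `staging` must not exist yet. Returns the staging path.
    """
    ignore = shutil.ignore_patterns(*patterns)
    return shutil.copytree(source, staging, ignore=ignore)


def bag_staged_copy(source, staging):
    """Stage a filtered copy of `source`, then bag and validate the copy."""
    import bagit  # third-party module (bagit-python), imported only when needed

    stage_without_excluded(source, staging)
    bag = bagit.make_bag(staging)  # bags the copy in place
    bag.validate()                 # raises bagit.BagValidationError on failure
    return bag
```

Copying rather than deleting is a deliberate choice here: the exclusion list decides what goes into the bag, while the source folder keeps everything it had.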

To my mind, this is the very purpose of tools like bagit-python, and of implementing specs in module form: agnostic backbones that allow you to augment and enhance features for your use cases.

L Snider

Feb 9, 2016, 1:31:28 PM
to digital-...@googlegroups.com
Just to chime in here. If you check the archives, I was asking about Mac add-on files like .DS_Store and Bagit. In my research I found that it wasn't always a good idea to remove those files, because it could cause issues later. I don't remember the specifics, but I decided to keep them.

Others may have different views on it.

Cheers

Lisa

John Scancella

Mar 10, 2016, 1:49:22 PM
to Digital Curation
This has been fixed in the latest release of the 4.* branch, which you can find here: https://github.com/LibraryOfCongress/bagit-java/releases/tag/v4.12.0

