Feedback Req: Bash Tool for Accessioning

Powell, Kevin

Aug 9, 2016, 6:03:25 PM
to digital-...@googlegroups.com
Sorry for the repost. Someone pointed out that attaching a shell script to an email might get it flagged as spam. Here's my original post with a GitHub link at the bottom. 

----------------------------------------------------------------------

Hi Everyone,
 
I'm putting together a Bash script for our archives staff that creates some reports about a directory and its subdirectories. It's still very much in development and I'm definitely a novice at this stuff (shoutout to CRALS for jumpstarting my imagination), but the community's feedback would be much appreciated. 

How it (currently) works: 
  • Open Terminal.
  • Navigate to the directory where you want to save the reports.
  • Run ./path/to/createreports.sh -i path/to/targetDir
  • ClamScan runs first. If it finds infected files, you are asked whether you want to exit the program or continue running reports.
  • Adding the -b and/or -f flags after path/to/targetDir generates Bulk Extractor and/or FITS reports, respectively.
  • All script output is saved in the current working directory in a subdirectory called targetDir_reports.
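
For anyone curious how steps like these can be wired together, here is a minimal sketch of the option parsing and the ClamScan prompt. It is not the actual createreports.sh; the function names (parse_args, scan_or_prompt) and variable names are made up for illustration, and it assumes clamscan is on the PATH.

```shell
#!/usr/bin/env bash
# Hypothetical sketch only; names below are illustrative, not from the real script.

# Parse -i <targetDir> plus the optional -b (Bulk Extractor) and -f (FITS) flags.
parse_args() {
    local opt OPTIND=1
    TARGET_DIR="" RUN_BULK=0 RUN_FITS=0
    while getopts "i:bf" opt; do
        case "$opt" in
            i) TARGET_DIR="${OPTARG%/}" ;;   # drop a trailing slash, if any
            b) RUN_BULK=1 ;;
            f) RUN_FITS=1 ;;
            *) return 2 ;;
        esac
    done
    [ -n "$TARGET_DIR" ] || return 2         # -i is required
    # Reports go in <current dir>/<targetDir>_reports
    REPORT_DIR="$PWD/$(basename "$TARGET_DIR")_reports"
}

# Run ClamScan first; if infected files are found, ask whether to continue.
scan_or_prompt() {
    local status answer
    mkdir -p "$REPORT_DIR"
    clamscan -r -i "$TARGET_DIR" \
        > "$REPORT_DIR/$(basename "$TARGET_DIR")_virusCheck.txt"
    status=$?
    # clamscan exits with status 1 when at least one infected file is found.
    if [ "$status" -eq 1 ]; then
        printf 'Infected files found. Continue running reports? [y/N] '
        read -r answer
        case "$answer" in
            [Yy]*) return 0 ;;
            *)     return 1 ;;
        esac
    fi
    return 0
}
```

A caller would run parse_args "$@" followed by scan_or_prompt, and exit if either returns a non-zero status.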
The script generates four files by default:
  • targetDir_fileTypes.csv: CSV of file extension counts (2 TXT, 10 PDF, etc). 
  • targetDir_manifest.txt: Manifest of files in targetDir with file paths, file sizes, and Unix permissions.
  • targetDir_siegfried.csv: CSV output of Siegfried scan.
  • targetDir_virusCheck.txt: Output from ClamAV; prints list of infected files (if found) and scan summary. 
If Bulk Extractor and/or FITS are initiated, the outputs are saved in subdirectories targetDir_reports/targetDir_BulkExt and targetDir_reports/targetDir_FITS.
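
As an illustration of the file-extension report, here is one possible way to count extensions with find and awk. This is a sketch under assumptions, not necessarily how the script does it: the function name count_extensions and the "(none)" label for files without an extension are invented here, and file names containing newlines are not handled.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: print "EXTENSION,count" lines for every file under a directory.
count_extensions() {
    find "$1" -type f |
        awk -F/ '{
            name = $NF                      # file name without its directory
            n = split(name, parts, ".")
            if (n == 1 || (n == 2 && parts[1] == "")) {
                ext = "(none)"              # no dot, or a dotfile such as ".profile"
            } else {
                ext = toupper(parts[n])     # treat txt and TXT as the same type
                if (ext == "") ext = "(none)"
            }
            count[ext]++
        }
        END { for (e in count) printf "%s,%d\n", e, count[e] }' |
        LC_ALL=C sort
}
```

Something like count_extensions path/to/targetDir > targetDir_fileTypes.csv would then write the counts to a CSV file.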

The tool bundles commands for four separate open-source tools: ClamAV, Siegfried, FITS, and Bulk Extractor. The ClamScan and Siegfried commands run automatically, but FITS and Bulk Extractor are initiated with option flags. This works all right if I deploy the script locally at Brown, where I can install each tool myself, but I'm interested in sharing this script with others.
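
One way a shared script could cope with missing dependencies is to check for each tool up front. The sketch below is only a suggestion; the function name missing_tools is made up, and it assumes the tools are installed under their usual command names (clamscan, sf, fits.sh, bulk_extractor), which may differ on a given system.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: print the name of every requested command not found on the PATH.
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Example use (command names are assumptions about a typical install):
#   missing=$(missing_tools clamscan sf)
#   if [ -n "$missing" ]; then
#       echo "Please install: $missing" >&2
#       exit 1
#   fi
```

The optional tools (fits.sh, bulk_extractor) would only need checking when the -f or -b flag is given.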

I'm not sure if I'm reinventing the wheel here. I just wanted something simple that could generate information about an accession folder... we aren't going the full Archivematica route just yet.

The script can be found on GitHub if you're interested in taking a closer look. 

Thanks, 

Kevin Powell
Digital Preservation Librarian
Sciences Library, Box I
Brown University
Providence, RI, 02906

Powell, Kevin

Aug 23, 2016, 9:08:38 PM
to digital-...@googlegroups.com
Hi Everyone,

A quick update. After posting in this forum, someone informed me of Timothy Walsh's Brunnhilde, which took my reporting tool a few steps further. Brunnhilde uses Python and SQLite to generate helpful reports based on a Siegfried scan. I added ClamScan and Bulk Extractor options to the tool, which Tim incorporated into the recently released 1.0.0 version. Thank you once again, digital preservation community, for your spirit of openness and collaboration. 


Kevin Powell
Digital Preservation Librarian
Sciences Library, Box I
Brown University
Providence, RI, 02906


Matt Schultz

Aug 29, 2016, 11:49:12 PM
to digital-...@googlegroups.com
Works like a charm! Just what we needed here in our Library. Thanks to you all for your work on this.
— 
Matt Schultz

Metadata & Digital Curation Librarian

Grand Valley State University Libraries



--
You received this message because you are subscribed to the Google Groups "Digital Curation" group.
To unsubscribe from this group and stop receiving emails from it, send an email to digital-curati...@googlegroups.com.
To post to this group, send email to digital-...@googlegroups.com.
Visit this group at https://groups.google.com/group/digital-curation.
For more options, visit https://groups.google.com/d/optout.

Jarrett Drake

Sep 2, 2016, 1:54:15 PM
to digital-...@googlegroups.com
These are excellent tools, both Kevin's and Timothy's. Thanks for sharing with the community!


Timothy Walsh

Sep 6, 2016, 4:27:39 PM
to Digital Curation
I'm glad people are finding Brunnhilde helpful! Thank you, Kevin, for your great additions.

FYI, there is now a Brunnhilde GUI. It's designed to be especially easy to install/use in Bitcurator but should work on any Linux or Mac machine. The GUI requires v1.1.0 of brunnhilde.py, which can be found here.

Tim

---
Tim Walsh
Digital Archivist
Canadian Centre for Architecture
twa...@cca.qc.ca
Twitter: @bitarchivist