Bulk Extractor and alert lists to flag sensitive words

Eira Tansey

Sep 22, 2015, 2:33:46 PM
to Digital Curation
Hi list:

I have a group of records I'm working with that I know contain some sensitive files (e.g., job candidate evaluations). I want to identify these and segregate them from the rest of the accession. I am using Bulk Extractor (within BitCurator), and tried to create an alert list text file to flag sensitive words and phrases (see pages 27-28: http://digitalcorpora.org/downloads/bulk_extractor/BEUsersManual.pdf).

However, passing this alert list against a directory of files does not seem to result in any output, even though I know these words appear in at least some of the text documents. The built-in scanners are working fine and output things like phone numbers. Does anyone have advice? Alternative suggestions re: tools/methods for identifying documents with sensitive keywords would also be welcome.
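To make the goal concrete, here is a minimal sketch of the kind of keyword flagging described above, done outside Bulk Extractor. It is not how Bulk Extractor's alert lists work; it is a plain substring search, and it assumes the files are readable as plain text (formats such as .doc or .pdf would need text extraction first). The function name `flag_files` and the example terms are hypothetical, chosen only for illustration.

```python
from pathlib import Path


def flag_files(root, terms):
    """Return {file path: [matched terms]} for files under root.

    Assumption: files can be read as text; undecodable bytes are ignored.
    Matching is a case-insensitive substring search, nothing more.
    """
    lowered = [t.lower() for t in terms]
    hits = {}
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8", errors="ignore").lower()
        except OSError:
            # Skip files that cannot be opened (permissions, broken links).
            continue
        found = [t for t in lowered if t in text]
        if found:
            hits[str(path)] = found
    return hits


if __name__ == "__main__":
    # Hypothetical directory and terms, for illustration only.
    for name, words in flag_files("accession", ["candidate evaluation", "salary"]).items():
        print(name, words)
```

A listing like this would only be a first pass for deciding which files to review and segregate by hand; it will miss text stored in binary or compressed formats.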

Thanks,

 

Eira Tansey
Digital Archivist/Records Manager

Archives and Rare Books Library
University of Cincinnati Libraries
806 Blegen Library
2602 McMicken Circle
PO Box 210113
Cincinnati, OH 45221-0113

Direct Tel: 513-556-1958
Library Tel: 513-556-1959
Email: eira....@uc.edu
Web: www.libraries.uc.edu/libraries/arb/

