Generating PREMIS xml file via command line


Theo Wilderbeek

Dec 19, 2016, 12:58:00 PM
to BitCurator Users
Hi, I am experimenting with different options for getting the BitCurator Reports without using the GUI. I would like to know whether it is possible to generate the PREMIS XML file from the command line, because I cannot find any Python file related to this function.

Thank you

Kam Woods

Dec 19, 2016, 2:59:49 PM
to bitcurat...@googlegroups.com
The script that supports this functionality is "bc_genrep_premis.py". It can be run from the command line, but it is not currently installed as a script by the setuptools installer. You can run locate to find it in the bctools package in BitCurator, or download a copy from GitHub.

You can run and view the help options with:


This script hasn't been updated in some time. If you're interested in improving it, or updating the installer, feel free to send a pull request to the distro-tools repository!

Kam

--
You received this message because you are subscribed to the Google Groups "BitCurator Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to bitcurator-users+unsubscribe@googlegroups.com.
To post to this group, send email to bitcurator-users@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/bitcurator-users/e88c2d5f-3237-445f-ad61-a28547eb6f9d%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Kam Woods

Dec 19, 2016, 3:02:38 PM
to bitcurat...@googlegroups.com
Sorry, that should have been:
You can run it and view help options with: 

python3 bc_genrep_premis.py -h

from wherever you have the python script. The help output should look like this:

bcadmin@ubuntu:~/Desktop$ python3 bc_genrep_premis.py -h
usage: bc_premis_genxml.py [-h] [--dfxmlfile DFXMLFILE]
                           [--bulk_extractor BULK_EXTRACTOR]
                           [--Allreports ALLREPORTS]
                           [--premis_file PREMIS_FILE]

Generate PREMIS XML file for BitCurator events

optional arguments:
  -h, --help            show this help message and exit
  --dfxmlfile DFXMLFILE
                        DFXML file
  --bulk_extractor BULK_EXTRACTOR
                        Bulk-extrator Report file
  --Allreports ALLREPORTS
                        All Reports
  --premis_file PREMIS_FILE
                        Output Premis File; Concatinates if exists
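
For anyone scripting this, here is a minimal sketch of how a full invocation could be assembled from Python. It uses only the four flags listed in the help output above; the file and directory names are hypothetical placeholders, and the sketch only builds and prints the command rather than running it.

```python
import shlex


def build_premis_command(script, dfxml, be_report, reports_dir, premis_out):
    """Return the argv list for one run of bc_genrep_premis.py.

    The flag names are copied from the script's -h output; every path
    passed in is supplied by the caller (the ones below are placeholders).
    """
    return [
        "python3", script,
        "--dfxmlfile", dfxml,
        "--bulk_extractor", be_report,
        "--Allreports", reports_dir,
        "--premis_file", premis_out,
    ]


# Hypothetical paths, for illustration only.
cmd = build_premis_command(
    "bc_genrep_premis.py",
    "fiwalk.xml",
    "report.xml",
    "reports",
    "premis.xml",
)
print(shlex.join(cmd))
```

The resulting list can be passed directly to subprocess.run(cmd, check=True) once the paths point at real files.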

Theo Wilderbeek

Dec 20, 2016, 3:35:10 AM
to BitCurator Users
Thank you. I tried the script, but I am getting this error:

Traceback (most recent call last):
  File "bc_genrep_premis.py", line 244, in <module>
    premis.bcGenPremisXmlFiwalk(args.dfxmlfile, args.premis_file)
  File "bc_genrep_premis.py", line 184, in bcGenPremisXmlFiwalk
    self.bcGenPremisEvent(root, eventIdType, "File System Analysis", eventDetail, eDateTime, eOutcome, eoDetail, of_premis, fw_tab)
  File "bc_genrep_premis.py", line 65, in bcGenPremisEvent
    event = etree.SubElement(root, 'event')
TypeError: Argument '_parent' has incorrect type (expected lxml.etree._Element, got str)

I am using a previously created fiwalk XML file and a previously created report.xml from bulk_extractor.
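
The last line of the traceback says that SubElement was handed a string where it expected an element object. The thread does not show why root was a string at that point, so the following is only a sketch of the type requirement itself. It assumes the standard library's xml.etree.ElementTree as a stand-in for lxml (the exact error text differs, but the parent argument must be an element in both).

```python
import xml.etree.ElementTree as ET


def add_event(root):
    """Append an <event> child to root, mirroring the failing call."""
    return ET.SubElement(root, "event")


# Works: the parent is an Element object.
premis = ET.Element("premis")
add_event(premis)
print([child.tag for child in premis])

# Fails: the parent is a plain string, as in the traceback.
# CPython's C implementation raises TypeError; the pure-Python
# fallback raises AttributeError, so both are caught here.
try:
    add_event("premis")
    failed = False
except (TypeError, AttributeError) as exc:
    failed = True
    print(type(exc).__name__)
```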

Kam Woods

Dec 20, 2016, 10:50:21 AM
to bitcurat...@googlegroups.com
The PREMIS script I referenced also requires the full set of BitCurator reports to have been generated. From the command line, you can do this by annotating the bulk_extractor output (the annotated output is generated by identify_filenames.py; in BitCurator this is located in /usr/share/dfxml/python) and then running generate_report.py (from anywhere). The report generator requires the fiwalk XML file, the annotated feature directory, and a new output directory. This new output directory can then be used as the input for the "Allreports" flag in the PREMIS script.
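
Before running the report generator, the three inputs described above can be sanity-checked with a short helper. This is a sketch only: check_report_inputs is a hypothetical name, and reading "new output directory" as "must not exist yet" is an assumption, not something the report generator is confirmed to enforce.

```python
from pathlib import Path


def check_report_inputs(fiwalk_xml, annotated_dir, out_dir):
    """Return a list of problems; an empty list means the inputs look usable."""
    problems = []
    if not Path(fiwalk_xml).is_file():
        problems.append(f"fiwalk XML file not found: {fiwalk_xml}")
    if not Path(annotated_dir).is_dir():
        problems.append(f"annotated feature directory not found: {annotated_dir}")
    # Assumption: "new" means the output directory should not exist yet.
    if Path(out_dir).exists():
        problems.append(f"output directory already exists: {out_dir}")
    return problems


# Hypothetical paths, for illustration only.
for problem in check_report_inputs("fiwalk.xml", "annotated", "reports"):
    print(problem)
```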

These scripts have not been updated in a very long time, and could definitely use a rewrite. If you're interested in contributing (either conceptual comments or code), opening an issue in the appropriate git repository would probably be the best way to ensure this stays on the radar for the Consortium.

Kam




Theo Wilderbeek

Dec 22, 2016, 11:53:18 AM
to BitCurator Users
Thanks, I tried that but the same errors persist. I am using a fiwalk XML file, a bulk_extractor XML file and the reports folder. What puzzles me is that the Python script bc_genrep_premis.py was not originally on my hard drive, and I had to download it from the bitcurator-distro-tools repository at https://github.com/bitcurator/bitcurator-distro-tools. My main goal is to find out whether all the files from BitCurator Reports can be generated from the command line, without using the GUI.


If I have some success, I will open an issue in the appropriate git repository. Thanks!

Kam Woods

Dec 22, 2016, 12:11:55 PM
to bitcurat...@googlegroups.com
Then be unpuzzled: bc_genrep_premis.py is installed as a py_module (that is, copied to the appropriate site-packages directory; you can locate it on any BitCurator system by typing "locate bc_genrep_premis.py"), but is not exposed as an executable script.
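
As an alternative to locate, a module installed this way can usually be found from Python itself. The sketch below assumes the module is importable under the name bc_genrep_premis by the interpreter that runs it; module_location is a hypothetical helper, and it prints None when the module is not installed for that interpreter.

```python
import importlib.util


def module_location(name):
    """Return the file a top-level module would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    return spec.origin if spec else None


# Prints a path under site-packages if the module is installed, else None.
print(module_location("bc_genrep_premis"))
```

If a path is printed, the script can be run directly with python3 followed by that path and -h.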

This could, of course, be changed in future releases. As I said, many of these tools need an update.

Kam
