Hi Tim,
For what it’s worth, in my testing of an AIP for which about 1800 files were normalized, I didn’t run into any timeout issues running headless LibreOffice, albeit on a local VM with no other processes running. Of course, your volumes may be higher still. With unoconv I did get some random timeouts, as discussed in this thread.
In terms of python3, it looks like you could use script type = “No shebang needed” and add the shebang, i.e. the path to the python3 interpreter (I must confess that’s a new term for me), as the first line. As you note, the Python script type has the python2 path hard-coded. From the FPR docs: “No shebang” allows you to write a script in any language as long as the shebang is included as the first line.
https://www.archivematica.org/en/docs/fpr/
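As a rough illustration of the shebang idea (not taken from the FPR docs), the sketch below writes a tiny Python 3 script whose first line names the interpreter. The function name write_py3_script is hypothetical, and it assumes python3 can be found via /usr/bin/env on the host; a fixed path such as /usr/bin/python3 would work the same way if that is where the interpreter lives.

```shell
#!/bin/sh
# Sketch only: write_py3_script is a hypothetical helper, not part of Archivematica.
# It writes a minimal Python 3 script to the path given as $1 and marks it executable.
write_py3_script() {
    cat > "$1" <<'EOF'
#!/usr/bin/env python3
# The first line above is the shebang: it names the interpreter for this file.
# Assumption: python3 is on the PATH of the user running the script.
import sys
print("running under Python %d" % sys.version_info[0])
EOF
    chmod +x "$1"
}
```

Running the generated file directly (for example ./hello.py) hands it to whichever interpreter the first line names, which is presumably why the python2 path hard-coded in the Python script type does not apply here.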
Tim
Tim Hutchinson
Archivist, University Archives & Special Collections
University Library, University of Saskatchewan
Tel: (306) 966-1643
Email: tim.hut...@usask.ca
On sabbatical leave, July 2017-June 2018
--
You received this message because you are subscribed to a topic in the Google Groups "Archivematica Tech" group.
To unsubscribe from this topic, visit
https://groups.google.com/d/topic/archivematica-tech/onaG67k3ADY/unsubscribe.
To unsubscribe from this group and all its topics, send an email to
archivematica-t...@googlegroups.com.
To post to this group, send email to
archivema...@googlegroups.com.
To view this discussion on the web visit
https://groups.google.com/d/msgid/archivematica-tech/6ca47be5-67a1-4a44-810c-7994fa393909%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Hi Tim,
At this point LibreOffice isn’t part of our production instance, but I think that’s the ultimate plan, in the spirit of “good enough” preservation. One of the tradeoffs will be deciding to what extent to automate such normalization. For example, we have a bunch of older Word files which convert quite nicely through the Windows command line, but the headers get mangled via Archivematica (i.e. Linux). So ideally there would be some manual normalization mixed in (that is, normalization independent of Archivematica). But normalizing as many files as possible, even with some less than desirable results, is preferable to leaving collections in backlog, and of course we have the option of re-ingesting with more customized normalization for collections that need it.
# Try the conversion up to 10 times; stop as soon as the output file appears.
for i in $(seq 1 10); do
    libreoffice --headless --invisible --convert-to docx --outdir "%outputDirectory%" "%fileFullName%"
    if [ -f "%outputDirectory%%fileName%.docx" ]; then
        # Rename the result to the name Archivematica expects.
        mv "%outputDirectory%%fileName%.docx" "%outputDirectory%%prefix%%fileName%%postfix%.docx"
        break
    fi
done
The only catch at the moment is that we've had the highest-quality results in testing (for Microsoft formats, especially) with LibreOffice 5, but didn't have any luck implementing that in Archivematica (it kept hanging), so for the moment we are using 4.2.8.2 420m0(Build:2), the version that is in the default Ubuntu apt repository.
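The retry loop above could be generalized along these lines. This is only a sketch: retry_until_exists is a hypothetical helper name, and the idea of treating "the expected output file exists" as the success test is carried over from the loop, not from any Archivematica API.

```shell
#!/bin/sh
# Sketch only: retry_until_exists is a hypothetical helper.
# Usage: retry_until_exists MAX_TRIES EXPECTED_FILE COMMAND [ARGS...]
# Runs COMMAND up to MAX_TRIES times and stops as soon as EXPECTED_FILE exists.
# Returns 0 if the file was produced, 1 if every attempt failed to produce it.
retry_until_exists() {
    max_tries=$1
    expected=$2
    shift 2
    attempt=1
    while [ "$attempt" -le "$max_tries" ]; do
        # The command's own exit status is ignored; only the output file counts.
        "$@" || true
        if [ -f "$expected" ]; then
            return 0
        fi
        attempt=$((attempt + 1))
    done
    return 1
}
```

Because the command is passed as ordinary arguments, a hang could in principle be bounded by prefixing it with the coreutils timeout command, e.g. retry_until_exists 10 out/report.docx timeout 120 libreoffice --headless --convert-to docx --outdir out report.doc (the 120-second limit and the file names are made up for illustration). I haven't tried this against LibreOffice 5.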
Tim