Improvement to Qubes PDF Converter

Micah Lee

Jul 13, 2017, 11:51:16 AM
to qubes...@googlegroups.com
The Qubes PDF Converter is excellent, but there are two annoying things
about the final trusted PDF: the file size can be enormous, and you lose
the text layer.

For example, I have a 133-page PDF that doesn't contain any images, but
has lots of text. The original file is 2.1 MB. After I convert it into a
trusted PDF, the final trusted PDF is 40.9 MB. (And none of the text is
searchable.)

If I use a tool called shrinkpdf [1], which is just a simple wrapper
around ghostscript, I can reduce that file size to 23.3 MB. Since the
40.9 MB PDF at this point is already trusted, there's no danger in
running it through gs without using another dispvm. I think it would be
great if this step (or something similar) were built into Qubes PDF
Converter, so that all final trusted PDFs are compressed.
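
To make the idea concrete, here is a rough sketch of what such a
compression step might look like as a small Python wrapper around gs.
This is not how shrinkpdf or Qubes PDF Converter actually does it: the
function names, the 72 dpi default, and the choice of the /screen preset
are my own assumptions for illustration. The gs options themselves are
standard pdfwrite switches.

```python
import subprocess
from pathlib import Path


def build_gs_command(src, dst, dpi=72):
    """Build a Ghostscript argv that rewrites `src` into a smaller `dst`.

    Hypothetical helper: name, dpi default, and preset are assumptions.
    """
    return [
        "gs",
        "-q", "-dNOPAUSE", "-dBATCH", "-dSAFER",
        "-sDEVICE=pdfwrite",
        "-dCompatibilityLevel=1.4",
        "-dPDFSETTINGS=/screen",
        # Downsample the page bitmaps, which dominate a trusted PDF's size.
        "-dDownsampleColorImages=true",
        "-dDownsampleGrayImages=true",
        "-dDownsampleMonoImages=true",
        f"-dColorImageResolution={dpi}",
        f"-dGrayImageResolution={dpi}",
        f"-dMonoImageResolution={dpi}",
        f"-sOutputFile={dst}",
        str(src),
    ]


def compress_pdf(src, dst, dpi=72):
    """Run Ghostscript on an already-trusted PDF; return the new size in bytes."""
    subprocess.run(build_gs_command(src, dst, dpi), check=True)
    return Path(dst).stat().st_size
```

Calling compress_pdf("trusted.pdf", "trusted-small.pdf") would need gs
installed in the qube holding the trusted file. A lower dpi gives a
smaller file at the cost of blurrier pages, so the right default would
need some testing.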

At some point in the future, it would also be awesome if the final
trusted PDF could be fed through something like tesseract-ocr to OCR it
and add a text layer back to the PDF. But I think that's a bigger
project, and compressing the PDF would be a nice first addition.
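
For the OCR idea, a very rough sketch, assuming the trusted pages are
available as individual image files (tesseract reads images, not PDFs).
The function names and the per-page approach are assumptions on my part;
the tesseract command line used here (image, output base, -l language,
and the "pdf" config to emit a searchable PDF) is the standard one.

```python
import subprocess


def build_ocr_command(image_path, output_base, lang="eng"):
    """Build a tesseract argv that writes `<output_base>.pdf` with a text layer.

    Hypothetical helper: the name and the default language are assumptions.
    """
    return ["tesseract", str(image_path), str(output_base), "-l", lang, "pdf"]


def ocr_page(image_path, output_base, lang="eng"):
    """OCR one already-trusted page image; return the path of the PDF written."""
    subprocess.run(build_ocr_command(image_path, output_base, lang), check=True)
    return f"{output_base}.pdf"
```

Each page would come out as its own one-page PDF, so the results would
still have to be merged back into a single document afterwards, which is
part of why this is the bigger project.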

[1] http://www.alfredklomp.com/programming/shrinkpdf/