Freezing Ocropus and genpdf for Decapod 0.5 November 9

Jonathan Hung

Oct 31, 2011, 1:24:27 PM
to dec...@googlegroups.com, Hasan Al-Khaffaf
Hi Hasan and everyone.

We are going to cut a release for Decapod 0.5 on Friday November 18. We would like to freeze on genpdf and Ocropus so we have a stable code base for release testing. We are proposing Wednesday November 9 (next Wednesday) to freeze genpdf and Ocropus. Is this reasonable?

Ocropus version: at the IDRC, we have been using Ocropus 0.4.4 (the same version that was shipped with Decapod 0.4). Should we continue using this for the release? Hasan, is this the same version you're using for your development?

genpdf: The goal is to have really good Type 1 PDF generation. Type 2 PDF support is less of a priority, but it would be good if we can demonstrate its potential in this release. More specifically:

Type 1 PDF:
1. take multiple images and produce a single PDF. (For more detail, see http://issues.fluidproject.org/browse/DECA-187)
2. image quality is faithful to the original (i.e., fix the colour inversion issue: http://issues.fluidproject.org/browse/DECA-157)

Type 2 PDF:
3. reasonably well scanned / photographed documents should produce good Type 2 results (see http://issues.fluidproject.org/browse/DECA-188)


The issues above are listed in priority order.

Is it possible to accomplish this by next Wednesday? Of the 3 issues identified, only #1 is critical. Issues #2 and #3 are important, but leaving them unfixed would still allow the user to create PDFs using genpdf (although the results may not be as nice).

- Jonathan.

---
Jonathan Hung / jh...@ocad.ca
IDRC - Interaction Designer / Researcher
Tel: (416) 977-6000 x3959
Fax: (416) 977-9844

Hasan Al-Khaffaf

Nov 1, 2011, 4:42:39 PM
to Jonathan Hung, dec...@googlegroups.com, Hasan Al-Khaffaf
Hi.
See my answers below:

On Mon, Oct 31, 2011 at 6:24 PM, Jonathan Hung <jh...@ocad.ca> wrote:
> Hi Hasan and everyone.
>
> We are going to cut a release for Decapod 0.5 on Friday November 18. We would like to freeze on genpdf and Ocropus so we have a stable code base for release testing. We are proposing Wednesday November 9 (next Wednesday) to freeze genpdf and Ocropus. Is this reasonable?
>
> Ocropus version: at the IDRC, we have been using Ocropus 0.4.4 (the same version that was shipped with Decapod 0.4). Should we continue using this for the release? Hasan, is this the same version you're using for your development?

Yes, I am currently using Ocropus 0.4.4 for development and testing.

> genpdf: The goal is to have really good Type 1 PDF generation. Type 2 PDF support is less of a priority, but it would be good if we can demonstrate its potential in this release. More specifically:
>
> Type 1 PDF:
> 1. take multiple images and produce a single PDF. (For more detail, see http://issues.fluidproject.org/browse/DECA-187)

This point can be done.

> 2. image quality is faithful to the original (i.e., fix the colour inversion issue: http://issues.fluidproject.org/browse/DECA-157)

I have checked DECA-157. It seems that Ocropus runs into trouble when converting this image from JPG to PNG.

> Type 2 PDF:
> 3. reasonably well scanned / photographed documents should produce good Type 2 results (see http://issues.fluidproject.org/browse/DECA-188)

This point is too complex to fix soon, because it relates more to Ocropus than to genpdf.

Jonathan Hung

Nov 2, 2011, 2:45:30 PM
to dec...@googlegroups.com
Hi Hasan,

This sounds good to me. Am I correct to assume you will be taking on these tasks?

I guess we'll bring up the Ocropus OCR issue with Thomas after this release. I don't think we should change Ocropus at this point to improve OCR.

- Jon.


Hasan Al-Khaffaf

Nov 3, 2011, 3:32:28 PM
to dec...@googlegroups.com, Hasan Al-Khaffaf
Hi Jon

I have already uploaded the latest version of decapod-genpdf.py, which can now produce a single PDF file from a set of input files or from one input folder.
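
To illustrate the "set of input files or one input folder" behaviour, here is a minimal sketch of how the inputs could be gathered into one ordered page list. It is not the code in decapod-genpdf.py: the function name `collect_images`, the extension list, and the alphabetical page order are all assumptions made for this example.

```python
import os

# Hypothetical list of accepted extensions; the formats genpdf actually
# accepts are not stated in this thread.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".tif", ".tiff"}


def is_image(path):
    """Return True if the file extension looks like a supported image."""
    return os.path.splitext(path)[1].lower() in IMAGE_EXTENSIONS


def collect_images(inputs):
    """Return an ordered list of image paths.

    `inputs` is either a list of image files or a list holding one folder.
    Folder contents are sorted by name (an assumption about page order);
    explicitly listed files keep the order in which they were given.
    """
    if len(inputs) == 1 and os.path.isdir(inputs[0]):
        folder = inputs[0]
        names = sorted(os.listdir(folder))
        return [os.path.join(folder, name) for name in names if is_image(name)]
    return [path for path in inputs if is_image(path)]
```

The resulting list can then be handed, page by page, to whatever routine writes the PDF.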

As for the other issues, some are related to Ocropus, which is outside the scope of my work. Others are related to both Ocropus and genpdf, and I am currently investigating them.

Hasan

Jonathan Hung

Nov 11, 2011, 10:59:23 AM
to dec...@googlegroups.com
Re: DECA-157, images being colour inverted

Following your email, I ran a test in which I first converted the offending JPEG to PNG using ImageMagick, and it appears the PNG causes the same problem. But when I convert to TIFF first and then run genpdf, the PDF is not inverted.

So I guess a temporary work-around would be to convert to TIFF first.
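
A minimal sketch of that work-around, assuming ImageMagick's `convert` command is on the PATH (newer ImageMagick releases name it `magick`). The helper names `tiff_command` and `convert_to_tiff` and the output folder layout are hypothetical, and this is not the actual Decapod application code.

```python
import os
import subprocess


def tiff_command(src, out_dir):
    """Build the ImageMagick command that converts `src` to a TIFF.

    Returns the argument list and the destination path. The output name
    is the source base name with a .tif extension (an assumption).
    """
    base = os.path.splitext(os.path.basename(src))[0]
    dst = os.path.join(out_dir, base + ".tif")
    return ["convert", src, dst], dst


def convert_to_tiff(src, out_dir):
    """Run the conversion and return the path of the new TIFF."""
    cmd, dst = tiff_command(src, out_dir)
    # check=True raises CalledProcessError if ImageMagick reports a failure.
    subprocess.run(cmd, check=True)
    return dst
```

Each converted TIFF would then be passed to genpdf in place of the original JPEG or PNG.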

Hasan Al-Khaffaf

Nov 11, 2011, 2:49:18 PM
to dec...@googlegroups.com
Does this mean that the conversion to TIFF will be added to your code?

Hasan

Jonathan Hung

Nov 11, 2011, 3:38:41 PM
to dec...@googlegroups.com
Yes, conversion to TIFF will be added to the application code, since it's not a bug in genpdf.

- Jon.