Questions related to tesseract 3.03 PDF output feature

Bruce

Jun 1, 2014, 12:08:59 PM
to tesser...@googlegroups.com
Hello,

I'm testing the PDF output feature of Tesseract 3.03 on Android.

My question is: does the PDF output feature support compressed JPG files?

More importantly, what are the requirements for the image file? I tested a bunch of image files, from JPG and TIFF to PNG, and so far only one particular PNG file has produced a PDF successfully; all the other images I tried failed to render :(

The lack of information about this feature doesn't help. I can't find any documentation or active discussion on this topic.

Help!!

I've attached two images: hellokitty.png generates the PDF successfully, while the second PNG file fails. I am not sure what the difference between the two is.

Thanks in advance :)
hellokitty.png
samplepng.png

Quan Nguyen

Jun 2, 2014, 11:22:55 PM
to tesser...@googlegroups.com
Testing with Tess4J, I had no problem generating searchable PDFs for both images.

I converted a sample TIFF to JPG and was able to create a PDF from it as well.

Bruce

Jun 6, 2014, 10:03:31 PM
to tesser...@googlegroups.com
Thanks Nguyen!

I'm using Tess-Two, an Android JNI wrapper for Tesseract, so maybe the code is a little bit different. Glad to find out that the underlying Tesseract supports JPG, TIFF and PNG!

Paul

Jun 14, 2014, 9:54:11 AM
to tesser...@googlegroups.com
I think it supports whatever the underlying Leptonica library supports. Tesseract uses Leptonica's PIX format internally, and all files are read and written using functions defined by Leptonica. When compiling Leptonica you can leave out support for file formats that you don't need. You can read more about it in Leptonica's README.

So probably, whoever compiled Tesseract/Leptonica for that JNI wrapper might have dropped support for some file formats, but this is just a guess.
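
To narrow this down, it can help to check what format each test image really is, since the file extension is not always accurate. The following is only a minimal sketch: it does not query Leptonica at all, it just inspects the leading magic bytes of a file. The function names `sniff_image_format` and `sniff_file` are made up for illustration.

```python
# Magic-byte signatures for the three formats discussed in this thread.
SIGNATURES = {
    b"\x89PNG\r\n\x1a\n": "png",
    b"\xff\xd8\xff": "jpeg",
    b"II*\x00": "tiff",  # little-endian TIFF
    b"MM\x00*": "tiff",  # big-endian TIFF
}


def sniff_image_format(header: bytes) -> str:
    """Return 'png', 'jpeg', 'tiff' or 'unknown' based on the leading bytes."""
    for magic, name in SIGNATURES.items():
        if header.startswith(magic):
            return name
    return "unknown"


def sniff_file(path: str) -> str:
    """Read the first 8 bytes of a file and classify them."""
    with open(path, "rb") as f:
        return sniff_image_format(f.read(8))
```

If a file turns out to be a JPEG or TIFF, the Leptonica build in use would need to have been compiled with the matching codec library (libjpeg or libtiff) to read it. Whether that explains a particular failure is still a guess.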

Bruce

Jun 22, 2014, 4:46:02 AM
to tesser...@googlegroups.com
Hi Nguyen,

I managed to get it to work after adding libjpeg and linking it to Leptonica. The PDF is generated successfully, and it opens in the PDF viewer on my Android phone and even in browsers on my PC. However, when I try to open it with Foxit PDF Reader on my PC, it crashes every time. Is that because of the PDF format being used? Can we specify the PDF format that we would like to produce?

I've attached a sample output PDF that I generated :) Can someone try opening it to see whether it is corrupted?
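
For a very rough first check that does not depend on any particular viewer, one could look at the version declared in the file header and confirm that the end-of-file marker is present. This is only a sketch: the function name `pdf_quick_check` is hypothetical, and passing this check does not prove the file is valid. It only catches gross problems such as a truncated file.

```python
import re


def pdf_quick_check(data: bytes) -> dict:
    """Report the declared PDF version and whether an EOF marker is near the end."""
    # A PDF file starts with a header line such as "%PDF-1.5".
    match = re.match(rb"%PDF-(\d+\.\d+)", data)
    version = match.group(1).decode("ascii") if match else None
    # The "%%EOF" marker is expected at the very end of the file;
    # search the last 1024 bytes to tolerate trailing whitespace.
    has_eof = b"%%EOF" in data[-1024:]
    return {"version": version, "has_eof": has_eof}
```

Usage would be along the lines of `pdf_quick_check(open("output5.pdf", "rb").read())`.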

Thanks again!
output5.pdf

Paul

Jun 23, 2014, 7:14:25 AM
to tesser...@googlegroups.com
Hi Bruce,

It works for me in Foxit 6.2.0.0429 without any error. What is your exact error message?

Paul

Bruce

Jun 23, 2014, 10:03:06 AM
to tesser...@googlegroups.com
Thanks Paul,

I'm using 6.2.0.0429 on my PC as well, and it just crashes without any error message. However, the same version on my company laptop doesn't crash and opens the PDF file without trouble.

I reinstalled Foxit on my PC and it is still the same. I have no idea why :) I guess my PC doesn't like Foxit.

Thanks again!
Bruce

Paul

Jun 24, 2014, 3:12:40 PM
to tesser...@googlegroups.com
Do you use the same OS version on both systems? I use Windows 7 Professional 64-bit on a notebook with an Intel Core i5 dual-core CPU.